CHAPTER VII


RESOURCE SHARING AND INTERCOMMUNICATION
AMONG COEXISTING PROCESSES


7.1 INTRODUCTION


The earlier chapters gave us an increasingly larger view of an executing
process. We began in Chapter 1 with the microscopic view that focused on the
minute but nontrivial details in the fetch and execute of individual GE 645
instructions in an executing process. Upon completing Chapter 6, we have
managed to enlarge our view of a process in execution about as far as possible
from the "in-vacuo" point of view taken thus far. That is, for the most part,
we have been considering the process in isolation, as if it were the only one
employing the computer system's resources. We know that each executing process
coexists in some sense with other processes: some may be executing
simultaneously on other processors (if there be more than one in service),
some are waiting a "turn" to execute on a processor, and still others may be
waiting for some event whose occurrence will enable the process to proceed
with execution. The collection of these coexisting processes clearly implies
(a competition for and) a sharing of hardware resources, a sharing of system
software and data bases, and control over this sharing. Most of the control
functions described previously were of a per-process nature and the data bases
considered were of a one-per-process type, e.g., stacks, descriptor segments,
KST's, etc. In this chapter we will be examining controls of a per-system
nature. Of course, the data bases that associate with these functions are
central to the operation of the entire system. Hopefully, when a subsystem
designer understands how a process functions (cooperates and/or competes) in a
milieu of other processes, he can better anticipate the performance of the
processes in which his subsystem(s) resides.


Types of Coexisting Processes 
In this overview section we shall anticipate what follows by summarizing the
types of processes that coexist in Multics and provide a rough indication as
to the nature of their interaction. We identify three kinds of coexistence.


1. Sets of seemingly unrelated user processes. (Multi-programming.)


Experience with earlier operating systems, including CTSS, has shown that it
is a rare console user whose process can fully utilize a fast processor.
Characteristically, a user process makes frequent requests for relatively
slow-to-commence block transfers of information from drum, disk or other I/O
devices. Even with devices that have high transfer rates, there is, to begin
with, an associated latency of several milliseconds or more, i.e., a delay
before the to- or from-core transfer may begin. During the delay and
subsequent transfer time, it may not be possible for the requesting process to
do any useful work. (This is certainly the case in Multics when a segment or
page fault has led to the initiating of the request for a drum or disk
transfer to core of the desired segment or page thereof.) In principle, either
the CPU must remain effectively idle while the process waits for completion of
the block transfer or the about-to-be idled process must somehow relinquish
the processor to another process which is in a position to execute on a
processor at this time. Systems for "passing the processor around" among
several processes, so as to prevent the idling of a CPU during I/O waits or
other delays, are known as multi-programming systems. Multics is, among other
things, a multi-programming system.


The set of processes that share a processor in the fashion we just crudely
described need not be related to one another in any explicit way. They may,
for instance, be a set of arbitrary user processes. Nevertheless, interaction
among these processes is clearly implied. First, they share the same
supervisory procedures and certain system data bases (tables), as segments in
their respective address spaces. Second, each process is compelled, while
executing, to occasionally assist an idled process that is waiting for a
particular event to "arrive." Although in each case the executing process must
cooperate, the user should be, and is, completely oblivious to the fact that
his process is performing as a Good Samaritan. This is because all such
activities will occur while his process is trapped in certain ring-0
supervisory routines. Since these routines are common to all user processes,
all processes are guaranteed to be Good Samaritans.


There are several ways an executing process may know about the arrival of an
event which is of interest to another (non-executing) process. However this
knowledge is acquired, the waiting process is alerted so that it may again
compete for time on the processor (or on a processor, if there is more than
one).


An executing process may, for example, be interrupted (by an identifiable
signal from another active unit) and in this way "told" about an event of
interest to another process. Should this occur, the interrupted process is
forced to (in some sense) wake up the process for whom the interrupt signal is
of primary interest. Alternatively, the executing process may on its own
discover the apparent arrival of such events. This type of discovery happens
with great frequency in Multics. When a process incurs a page fault and must
wait for the arrival of a requested page from the drum, it must put itself
into a wait state. Just before doing so, it always checks a certain
system-wide I/O request list in which it can discover which, if any, paging
requests (of other processes) have been completed. The executing process then
"notifies" the waiting processes before making itself idle.


2. User process and a system process.


Multics provides a set of "ever-present" system processes that offer
specialized services to user processes. Among these, for instance, is the
output driver, a system process that drives line printers and other output
devices to produce copies of user-designated segments. A user will usually
communicate implicitly with such a process by executing a system library
subroutine call. This routine in turn executes the explicit steps needed to
communicate an unambiguous work request to the system process, so the user is,
in fact, insulated from the details of interprocess communication.


Because the system process is a separate and fully independent process, its
functions may be achieved as a parallel operation. The system is free to
fulfill the requests it receives at its convenience. With the current
implementation, the user process proceeds to other chores without waiting for
an acknowledgment from the output driver that the requested output task is
done. For illustrative purposes, however, we can also imagine that when
requested such a system process could send a meaningful completion signal to
any user process. The latter might either wait for and be awakened by the
completion signal or periodically inspect a special "mailbox" for the presence
of a message sent by the former to indicate completion of the task.


3. Sets of deliberately cooperating processes.

The computation structures of many algorithms exhibit parallelism that can
never be taken advantage of when using only one processor. There is an
increasing interest in the computer community in providing the operating
system machinery which would permit parallel computation. The Multics design
provides a simple capability of this type.


The ordinary programmer notices parallelism at various levels, from the PL/I
statement level on up to the subroutine. Thus, in the righthand side of the
statement

        = (A*B + C)/(D*F + E);

computation of the numerator can, in principle, be carried out in parallel
with that of the denominator. Likewise, but at a more macroscopic structural
level, it is conceivable that in the statement

      T = det(A)*cos(x);

the function for computing the determinant of the matrix A could be executed
in parallel with the computation for the cosine of x, if each subroutine could
be invoked to execute in parallel on separate processors.


If one could invoke separate and parallel computations, a mechanism for
synchronizing the two parallel functions would also be needed. Thus, the
multiplication of det(A) and cos(x) clearly must be delayed until it is known
that both subroutines have returned values. Suppose, for instance, we consider
flow structure as depicted in Figure 7-1.


[Flow diagram. Recoverable box labels: "invoke det(A) as a separate and
parallel action"; "computation of det(A)"; "inform main sequence of completed
task"; test "is det(A) completed?" with No and Yes exits.]


Figure 7-1 Invoking and synchronizing a parallel action 


Boxes 1, 4 and 5 are representative of the logic required to invoke and to
synchronize the parallel action. Boxes 1, 4 and 5 can be carried out only if
some common data cells are shared between the two computations, so that one
computation can communicate with the other.
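
The invoke-and-synchronize pattern can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not Multics code:
threads stand in for the separate computations, and the names compute_t,
shared and done are hypothetical. The dictionary plays the role of the common
data cells, and the flag is what the parallel action sets to inform the main
sequence that its task is completed.

```python
import math
import threading


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def compute_t(a, x):
    """Evaluate T = det(A)*cos(x) with det(A) invoked as a separate action."""
    shared = {"det": None}        # common data cell shared by both computations
    done = threading.Event()      # completion flag (hypothetical name)

    def parallel_action():
        shared["det"] = det(a)    # computation of det(A)
        done.set()                # inform main sequence of completed task

    threading.Thread(target=parallel_action).start()  # invoke the parallel action
    c = math.cos(x)               # main sequence proceeds in the meantime
    done.wait()                   # "is det(A) completed?" If not, wait here.
    return shared["det"] * c
```

As the text goes on to observe, nothing here guarantees that the two
computations actually run simultaneously; the sketch only shows the
synchronization that makes the final multiplication safe.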


Clearly, there is some trade-off between the savings in time that can be
achieved in the parallel computation and the extra costs associated with use
of the machinery for achieving these gains. In Multics, opportunity for such
computation is provided. Parallel computation can be achieved by executing two
or more processes concurrently. However, in a Multics system with several
CPU's, the user is never given an opportunity to force their simultaneous
allocation to his separate processes. Hence, in Multics, "parallel
computation" is just a possibility, never something that can be guaranteed.
Although it is more realistic to regard the execution of such separate
computations as asynchronous rather than parallel, we shall generally use the
latter term and understand it in its properly qualified sense.


Machinery for interprocess communication is provided in Multics by which to
create and/or invoke other processes and also to synchronize with other
processes using shared data bases known as "event channels." With this
machinery, meaningful cooperation (e.g., parallel computation) can be
conducted. (The scope of the parallel tasks may be large or small, as the user
or users desire.) Explicit user programming, of a type to be described in this
chapter, is required to achieve this objective. Moreover, the programming for
each such "subsystem," for we can indeed regard such planned cooperation as a
subsystem design, is specific to the objective at hand.


In review of the above three types of coexisting processes, we see that in all
cases:

1. "Cooperation," whether voluntary or involuntary, preplanned by
   the system or explicitly planned by the programmer, implies
   communication between processes through the use of shared data bases.

2. Of necessity, all coexisting processes, whether they cooperate or
   not, also compete for processor time and core space.*

3. By design, all processes share common supervisor modules and
   certain system tables.


* We make the implicit assumption that nearly always there are more processes
able to execute than there are available processors. A similar assumption is
made with respect to core space, i.e., there is not enough to "go around."


For gaining additional perspective, it is well to consider how Multics differs
in its design approach for achieving and controlling parallelism from the
approach taken in more traditional operating systems.


In earlier operating systems, parallel operations were limited to I/O
activities, hence the mechanisms for controlling parallelism (interrupt
handling and dispatching) became embedded in supervisory packages called I/O
Control Systems. Of course, in the context of the more "modern" multi-access,
multiprocessor systems the same or equivalent mechanisms together with some
new ones are also inherent in (a) achieving the multiprogramming of unrelated
coexisting processes (type 1), and (b) the invoking and synchronizing of
parallel computations in coexisting processes (type 3). The additional
mechanisms include appropriate locking controls on shared data bases and the
means of communicating (e.g., sending signals) between the independently
operating hardware processors. For this reason, Multics has split out the
traditional aspects of I/O Control Systems having to do with parallelism
(interrupt handling and dispatching) and has combined these with the other
aspects of parallel processing.


The combination resulting from this unified viewpoint has led to the
development of a single, general purpose supervisor subsystem for control of
all parallel operations. This subsystem is known as the Traffic Controller.*


7.2 MULTIPLEXING PROCESSORS


Our immediate goal is to see how Multics achieves the orderly and effective
multiplexing of its processor(s) among the coexisting processes. A set of
modules referred to as the Traffic Controller is responsible for this
activity.


We shall learn about the functions of the Traffic Controller (or TC) in an
incremental fashion. First, we view those functions needed to support
multiprogramming. These are the mechanisms to give away and get back a
processor when predictable time delays, such as those due to paging, are
forced on a process. Next, we shall consider the TC viewed as a general
mechanism for time sharing, for interprocess communication, and for achieving
still other control functions.


* The Traffic Controller was first formulated by J. H. Saltzer in a lucid
Ph.D. dissertation (MAC TR-30, July 1966, "Traffic Control in a Multiplexed
Computer System"). A modification of Prof. Saltzer's original design,
developed in the M.S. thesis by Robert Rappaport, has been incorporated in
Multics. The principal MSPM references are the BJ sections.
7.2.1 A Simple Mechanism for Multiprogramming


Consider a set of n seemingly unrelated user processes in a system of k
processors (k < n). Each process coexists in one of three "execution states":
running, ready, or waiting.

A running process is one that is currently executing on a processor. (At
most, k of the processes are in the running state.)

A ready process is one that would be running if a processor were available
for it to run on. (There are at most n-k ready processes.)


Since we picture that processes will frequently incur page faults, some of
the n-k non-executing processes will be waiting for the completion of
previously invoked paging requests (normally from the drum). So, we define a
waiting process as one that cannot make immediate use of a processor (even if
one were available), because it is waiting for a so-called system event to
happen. Arrival of a page in core is an example of a system event, which can
be defined as an event

      (a) which is of interest to at least one coexisting process* and
      (b) the waiting time for whose occurrence has a predictable upper bound.

Waiting processes compete for processors in the sense that once the waited-for
event has occurred, the processes should then be regarded as being ready.


* Two processes could conceivably take page faults for the identical page of
the same shared segment. The page fault in the second process could occur
after the page fault taken by the first process, but before the page request
initiated by the first process has been completed. The net result is that when
the system event (completing the page request) finally occurs, it will be of
interest to two waiting processes.


The Traffic Controller's tasks for multiplexing processors among this class
of processes are conceptually simple. Its activities center around the
maintenance of a list of the coexisting processes. (This list is called the
Active Process Table, APT.) For each process on the list, the Traffic
Controller (TC) associates the current execution state and other vital data.
Thus, for a process that is marked as running there is also recorded a code
that identifies the particular processor on which the listed process is now
executing. Entries for processes that are marked ready can be pictured as
belonging to a so-called "ready list." Although we shall examine this list in
greater detail later, for the moment it is best to view the ready list as a
FIFO-managed list. Finally, for each listed process that is marked as waiting
there is recorded an identifier for the event being waited for.
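
As a rough aid to the picture just given, the table and its FIFO ready list
might be modeled as follows. This is a minimal sketch under our own
assumptions: the class names, field names and method names are hypothetical
and do not reflect the actual layout of the APT in Multics.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "running"
    READY = "ready"
    WAITING = "waiting"


@dataclass
class AptEntry:
    process_id: str
    state: State
    processor: Optional[int] = None   # recorded only while running
    event_id: Optional[str] = None    # recorded only while waiting


class ActiveProcessTable:
    def __init__(self):
        self.entries = {}             # process_id -> AptEntry
        self.ready_list = deque()     # FIFO of ready process ids

    def add_ready(self, process_id):
        """List a process and place it at the end of the ready list."""
        self.entries[process_id] = AptEntry(process_id, State.READY)
        self.ready_list.append(process_id)

    def dispatch(self, processor):
        """Give the processor to the topmost ready process."""
        entry = self.entries[self.ready_list.popleft()]
        entry.state = State.RUNNING
        entry.processor = processor
        return entry

    def wait(self, process_id, event_id):
        """Mark a running process as waiting for the named system event."""
        entry = self.entries[process_id]
        entry.state = State.WAITING
        entry.processor = None
        entry.event_id = event_id
```

A process that takes a page fault would, in this model, call wait with an
identifier for the page request, after which dispatch hands its processor to
the next ready process.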


When a Good Samaritan process (executing in ring 0) notes that a waited-for
system event has arrived, it calls the Traffic Controller to "notify" the
appropriate process(es) that has (have) been waiting. The Good Samaritan is,
in nearly all instances, any process that may have just taken a page fault.
While trapped in the ring-0 supervisor for the purpose of initiating a page
request, this process automatically scans a system-maintained list of
(drum/disk) I/O requests in search of those that are marked as satisfied
(i.e., done). If any are found, the TC is called, giving it an identifier for
the event that has occurred. The TC then uses this identifier to notify the
appropriate processes in the following manner. For each given event identifier
the TC locates on its list of processes those that are waiting for the
identical event. For each such process, the TC then alters the code for its
execution state from waiting to ready, and makes this process a part of the
ready list. Eventually, this entry becomes topmost on the ready list.
(Remember, we are thinking of it as a FIFO list.) The TC will select the
associated process for execution, and when this happens, the APT entry is
recoded as running, thereby removed from the ready list. The code for the
processor given to this process is also recorded in this APT entry.
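
The scan-and-notify sequence can be illustrated with a short sketch. It is
hedged in the usual way: the dictionaries stand in for APT entries and I/O
request entries, and the "notified" flag is our own assumption, added only so
that a completed request is reported once.

```python
from collections import deque


def notify(apt, ready_list, event_id):
    """Move every process waiting for event_id from waiting to ready."""
    for process_id, entry in apt.items():
        if entry["state"] == "waiting" and entry["event"] == event_id:
            entry["state"] = "ready"
            entry["event"] = None
            ready_list.append(process_id)   # joins the end of the FIFO list


def good_samaritan_scan(io_requests, apt, ready_list):
    """Scan the I/O request list and notify waiters of satisfied requests."""
    for request in io_requests:
        if request["done"] and not request["notified"]:
            notify(apt, ready_list, request["event"])
            request["notified"] = True      # assumption: report each event once
```

Note that a single event may ready more than one process, as in the case of
two processes that faulted on the same page of a shared segment.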


In summary, we see that notifying a process of a system event does not
immediately place it in the running state. A process must first pass through
the ready state en route from waiting to running.


It may have occurred to you to ask the following question: Once put into the
running state, is there any iron-clad guarantee that the page for which a
process had been waiting will, in fact, be there? There is some possibility
that in the interim, between the time the process was first "notified" that
its requested page was in core and the time it finally reacquired a processor
to reference that page, the page has again been removed from core. This
situation could arise in the following way. Let "x" be the page in question.
Then, during the said interim, which could be a long one, other running
processes may have page demands that are satisfied by "pushing out" page x. If
in fact this situation were actually to occur, the victimized process, when it
regained the running state, would re-execute the original faulting instruction
and again cause a page fault. The running process would reinitiate the same
page request, play Good Samaritan to call the TC to notify others of completed
page requests, and again call the TC to "put itself" into the wait state and
give up the processor to an eligible (ready) process.


The hypothetical situation just described pictures a process cycling through 
the running-wait-ready states without ever accomplishing anything but page faults. 
This situation is one of several types of ''system thrashing" which the system de- 
signer is always bent on preventing. Thrashing is prevented in Multics in the 
following way. The number of processes eligible for CPU attention is kept below 
a limit which, if exceeded, would cause thrashing. This limit would be respected 
even if it means allowing a CPU to become idle occasionally. Moreover, a pro- 
cess that is forced into the wait state never loses its priority relative to the other 
eligible processes. So, when the process has been notified that it may resume 
execution, there can be at most a limited number of processes queued ahead of 
it. The possibility that the desired page would be pushed out before it can be used 
by each process that has faulted on it is thus made negligible. We defer until
Section 7.4 a full elaboration on how this fine control is achieved.


7.2.2 The TC Used for Time Sharing


If a running process is executing in a computation loop (deliberate or
accidental) such that it takes no page faults over an extended period, what
prevents this process from "monopolizing" the processor? The scheme for
multiprogramming that was described in the preceding subsection does not
indicate any way that other processes (ready or waiting) can get their "turn."
Clearly, an additional mechanism is needed to force a sharing of the processor
on the basis of elapsed time in execution. This mechanism is provided in the
TC by assigning to each coexisting process an appropriate execution time
allotment, q, such that when a process has executed for a total of q time
units, it is forced to give up the processor. The time allotment is a value
that is under the control of the system administrator. The control is achieved
with the help of hardware as elaborated slightly in the following discussion.


First, it should be noted that the time allotment, q, for each process is
assigned by a module of the TC called the "scheduler." The value for q is
stored in the Active Process Table entry for the process. Also kept in the
same entry is the amount of time, r, which has already been used in execution
against the allotment, q. When a process enters the running state, one can
picture that the TC sets a timer register with the value q-r. The timer
register then counts down to zero. When it reaches zero, a combination of
hardware and software causes the generation of a process interrupt. The
interrupted process then calls an appropriate entry point in the Traffic
Controller which will

      1. "reschedule" the now running process for execution at a later time,
         and
      2. give away the processor to the "next" ready process.


The term "rescheduling" refers to the tasks of

      (a) giving the interrupted process a new time allotment q' (not
          necessarily equal to its last value of q) for use the next time
          it is allowed to enter the running state;

      (b) putting the process into the ready state by marking the execution
          state as ready, resetting the value of the time used (r) to zero,
          and making other updates to its APT entry. The business of
          deciding where on the ready list to place a process is discussed
          in Section 7.4.

When a running process moves to the wait state because it has incurred a
page fault (or is forced to wait for some other system event, e.g., the
unlocking of a system table), the value of r is kept in its APT entry and is
incremented by an amount inferred from the reading of the timer register.
Typically, the process' current time allotment is, in fact, used up over a
sequence of short executions, each punctuated by a page fault or other system
delay that causes the process to pass through the wait and ready states.
Eventually, the time allotment is used up, at which time the process must be
"rescheduled." Other things being equal, if a process must be rescheduled, it
is given a larger time allotment, but it is also given a lower priority, which
in effect means that its "insertion point" in the ready list is made
correspondingly less favorable.
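
The bookkeeping for q and r can be made concrete with a small worked sketch.
The names below are hypothetical, and the choice of the new allotment is left
to the caller, since the rule for choosing q' is not given until Section 7.4.
The one piece of arithmetic taken from the text is that the timer is set to
q-r, so the time used in one stretch of execution is that setting minus the
timer reading at the moment the process stops.

```python
from dataclasses import dataclass


@dataclass
class Allotment:
    q: float          # execution time allotment
    r: float = 0.0    # time already used against the allotment


def timer_setting(a):
    """Value loaded into the timer register when the process starts running."""
    return a.q - a.r


def charge_on_wait(a, timer_reading):
    """Add the time used in this stretch, inferred from the timer reading."""
    a.r += timer_setting(a) - timer_reading


def exhausted(a):
    """True when the whole allotment has been used and rescheduling is due."""
    return a.r >= a.q


def reschedule(a, new_q):
    """Give the process a new allotment q' and reset the time used to zero."""
    a.q = new_q
    a.r = 0.0
```

For example, with q = 10 a process that faults after 3 units leaves the timer
reading 7, so r becomes 3 and the next timer setting is 7.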


In addition to giving a more detailed look at rescheduling, Section 7.4 also
describes an eligibility restriction and a set of pre-emption mechanisms.
Eligibility refers to the depth or "degree" of multiprogramming. It is the
number of processes that are permitted to compete for a processor at any one
time. The number of eligible processes is necessarily restricted in order to
prevent thrashing, i.e., destructive competition for the limited core
resources. In simple terms, eligibility can be viewed as a conserved resource
of the system that is passed about among the processes, like the fixed and
limited number of lunch trays that is circulated among the much larger number
of daily customers that pass through a cafeteria. Eligibility is first
conferred on a process when that process reaches a certain preferred point on
the ready list. Eligibility is later withdrawn from a process when it is
rescheduled (moved) to a less favorable position on the ready list or when it
leaves the ready list for a long time.
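
The lunch-tray analogy suggests a simple model, sketched below. It rests on
assumptions of our own: the class and method names are hypothetical, and the
"preferred point" is taken to mean the first few positions of a FIFO ready
list, which is a simplification of the scheme described in Section 7.4.

```python
from collections import deque


class EligibilityPool:
    """A fixed number of 'trays' passed about among listed processes."""

    def __init__(self, limit):
        self.limit = limit            # maximum number of eligible processes
        self.eligible = set()
        self.ready_list = deque()

    def _confer(self):
        # Confer eligibility from the head of the list while trays remain.
        for process_id in self.ready_list:
            if len(self.eligible) >= self.limit:
                break
            self.eligible.add(process_id)

    def enqueue(self, process_id):
        """Add a process at the end of the ready list."""
        self.ready_list.append(process_id)
        self._confer()

    def withdraw(self, process_id):
        """Take eligibility back, e.g. when the process leaves for a long time."""
        self.eligible.discard(process_id)
        if process_id in self.ready_list:
            self.ready_list.remove(process_id)
        self._confer()                # the freed tray passes to the next in line
```

The count of eligible processes never exceeds the limit, even if that leaves
some listed processes unable to compete for an otherwise idle processor.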


Pre-emption refers either to the capture of a processor (CPU pre-emption) or
to the capture of eligibility (eligibility pre-emption). The former allows an
eligible process that is being readied, if it is "important" enough, to cause
the capture of a processor. Capture by a high priority eligible process is
either immediate, in case it is being notified that a page read has been
completed (or some other system event has occurred), or at the end of the next
time unit (currently one second), in all other cases.

CPU pre-emption in the current implementation of Multics occurs in a
completely automatic way. For example, whenever an ineligible process that is
being rescheduled has a high enough relative priority, that process becomes a
candidate to pre-empt a processor. The process will, in fact, pre-empt a
processor if a search of the APT entries for those processes now running
(including that of the process doing the searching) reveals one (process) that
is "less important." In a multi-processor configuration, if there is a choice,
the running process that is "pre-empted" will be the one of lowest
"importance." Any process that is CPU-pre-empted is rescheduled (e.g., put
back on to an appropriate point in the ready list).

Eligibility pre-emption is also automatic, permitting a high-priority but
ineligible process to capture the eligibility of a lower priority running
process.


7.2.3 Block, Wakeup Functions for Use in I/O Control and in General
      Interprocess Communication


Two more mechanisms are provided in the TC that are designed mainly to
facilitate the synchronizing of deliberately cooperating processes. These are
called the block and wakeup functions. They are functionally different from
the previously described wait and notify functions. The wait, notify
mechanisms, which can be called only from ring 0, allow a process to wait on
(and later be notified of) system events; block, wakeup mechanisms allow a
process to wait on (and later be notified of) so-called process events.


By a process event we mean an occurrence that can be of interest to only a
specific process (or set of specific processes). The waiting period for such
an occurrence will not, however, be either predictable by the supervisor or
bounded. To retain the distinction between the two types of waiting, we say
that a process enters the blocked state when a process begins waiting for a
process event. We will say that the process receives a wakeup when it is
notified of the occurrence of that event.


Before going further into detail of these TC mechanisms, it will help to
consider an illustrative example (somewhat contrived) of a user process
synchronizing its activities with a hypothetical system process that manages
teletyped I/O (tty manager)*. We will picture an I/O operation involving the
typing out of a string of several thousand characters.


Suppose the tty manager shares a segment with the user process that can be
regarded as an output buffer. For simplicity, let it be 270 characters in
length. We picture that the tty manager copies characters out of the buffer in
amounts that range up to 270 characters at a time (enough to type up to three
full lines on a certain brand of teletype). The user process attempts to move
up to 270 characters of the long output string into the smaller buffer area on
each transit through its write loop. With the aid of pointers into the buffer,
each process is able to interpret the information in the buffer in an
appropriate way. Thus, the pointers identify and delimit the next group (up to
270) of characters in the buffer which may be moved out (in groups of up to 90
characters) by the tty manager. The same or other pointers tell the user
process which set of spaces within the buffer are "open," i.e., may be filled
with the next group of characters from the output string. Two situations are
apt to arise.

      a. The user process may find at this instant that there is not enough
         room in the buffer for the next group of i ≤ 270 characters to be
         copied into it. We will presume that under these circumstances,
         the user process would want to place itself into the blocked state
         until the tty manager has had an opportunity to "empty out"
         enough of the buffer to provide the room needed by the user
         process.

      b. The tty manager may find it has emptied out the buffer, i.e.,
         there is no information in the buffer to be moved out. In the
         event there is nothing else that the tty manager can do, it
         would then have to wait for the arrival of more data into the
         buffer. The tty manager would then want to put itself into
         the blocked state to await the desired event.


* Readers should realize that in the present implementation of I/O Control in
Multics, I/O supervisory procedures that control teletypes are part of each
user process. No manager process is needed as a "middle man" or broker to
execute these I/O functions. The hypothetical example of the manager process,
once thought to be useful for system-wide service, is instructive and may
prove applicable in the design of special subsystems.


Note (according to our earlier definition) that the events for which both the
user and manager process might wait are process events. Thus, no other process
but the manager will be interested in being notified that there are now
characters in the buffer which are to be copied out onto the teletype.
Moreover, there is no general way to predict how long the manager might have
to wait for this notification, because the user process may incur various
delays (including delays due to paging, due to computations of arbitrary
length, or due to entering the blocked state) in the course of cycling through
its write loop. Thus, the user process might be pre-empted or timed out during
any one transit of its write loop.


There is, in fact, a symmetry here in the synchronizing of these two
processes that can easily be seen. If one process, say the user, blocks
itself, the other process (in this case the manager process) wakes up the
first process and vice versa. Also, note the important implication that when
each process blocks itself it is counting on the other process to "wake it
up." It is usually unimportant for the blocked process to know when it will be
awakened, but it is always crucial for that process to know it will be
awakened.
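
The symmetric block and wakeup discipline of the two situations above can be
sketched as a small simulation. It is an illustration under stated
assumptions only: threads stand in for the two processes, the class Process
and the function buffered_write are hypothetical names, and a latched flag is
used so that a wakeup sent just before the partner blocks is not lost. The
actual TC mechanisms are described in Section 7.5.

```python
import threading
from collections import deque

BUFFER_SIZE = 270   # characters in the shared output buffer
GROUP = 90          # characters the manager moves out per step


class Process:
    """Hypothetical stand-in for a process that can block and be awakened."""

    def __init__(self):
        self._pending = threading.Event()

    def block(self):
        self._pending.wait()     # returns at once if a wakeup already arrived
        self._pending.clear()

    def wakeup(self):
        self._pending.set()


def buffered_write(text):
    buf, lock, state = deque(), threading.Lock(), {"closed": False}
    user, manager = Process(), Process()
    typed = []

    def write_loop():            # user process
        pos = 0
        while pos < len(text):
            need = min(BUFFER_SIZE, len(text) - pos)
            with lock:
                has_room = BUFFER_SIZE - len(buf) >= need
                if has_room:
                    buf.extend(text[pos:pos + need])
                    pos += need
            if has_room:
                manager.wakeup() # there are now characters to copy out
            else:
                user.block()     # situation (a): not enough room
        with lock:
            state["closed"] = True
        manager.wakeup()

    def read_loop():             # tty manager process
        while True:
            with lock:
                n = min(GROUP, len(buf))
                out = "".join(buf.popleft() for _ in range(n))
                closed = state["closed"]
            if out:
                typed.append(out)
                user.wakeup()    # room has been made in the buffer
            elif closed:
                return
            else:
                manager.block()  # situation (b): buffer is empty

    threads = [threading.Thread(target=write_loop),
               threading.Thread(target=read_loop)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return "".join(typed)
```

Each loop rechecks the buffer after returning from block, so an early or
repeated wakeup does no harm; what matters is that every blocked process is
eventually awakened by its partner.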


We are now ready to see how the mechanisms block and wakeup that are provided
in the TC would be applied. Our initial view will of necessity be greatly
simplified. A more complete description is given in Section 7.5. Figure 7-2
will be helpful for our present purpose. It sketches some of the details in
the write loop of the user process and in the synchronized read loop of the
tty manager, which, taken together, characterize the buffered write operation.
The synchronizing steps (loops) being described here are generated as a result
of using ordinary source language I/O calls. In this chapter we shall not
consider how the I/O control system converts user-written calls such as:

      call write_out(string);

into the steps being described in Figure 7-2.


[Figure 7-2 panels: Write Loop of the User; Read Loop of tty Manager]

Figure 7-2 Schematic showing the synchronizing of a user
           process with the tty manager process for a
           write operation (simplified view).


It is advisable to begin the discussion of Figure 7-2 at box 1. If the user
process fails the test in that box (i.e., no place open in the buffer), it calls the
"block" entry of the TC. The three dots on the line between boxes 1 and 2 and
between boxes 2 and 3 are provided to suggest that the call to block and the return
from it are really handled indirectly, i.e., through a chain of intermediate
(supervisory) routines to be described in Section 7.5. In actual fact, a user will not be
aware that his process is calling block. The argument in the call is a returned
pointer to a location where A can expect to find a "message" signifying the event(s)
being waited for has (or have) arrived. Basically, what the TC does when called
at box 2 is the following:

The APT entry for process A will be marked blocked and the processor will
be switched to the process which is currently at the top of the ready list.


Process A has now been taken out of the running state and hence cannot return
from its call to block (box 2) until after the event it waits for has arrived.
This is why the line from box 2 to box 3 is shown with a break. A can return
to the running state and thereby "jump the line break" in the flow diagram
(so to speak) only after process B has executed a call to the wakeup entry of the
TC (in box 6). In this call, process B names as arguments A's unique process
identifier and a message, message_n, that represents the event A is expecting.
The message must later be recognized by process A before A can re-enter the
running state.


The TC upon being called at box 6 will place the given message_n in a system-wide
shared data base that is, of course, accessible to process A. The TC also
places a pointer to the message in the APT entry for process A. Here, we bear
in mind that only the TC is allowed access to the APT (which is also a system-wide
shared data base). Next, the TC places process A in the ready state, making
A's entry part of the ready list. Wakeup now returns to its caller and execution
in process B proceeds through box 7.


Now that A is in the ready state, it can compete again for a processor.
When A subsequently gets a processor, it will resume execution within the TC
module in which it (A) was last executing and then return from block at the exit
of box 2. Block returns the pointer to the message sent by B as a return argument
(i.e., the second argument in the call).
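
The block and wakeup steps just described can be sketched as a toy model. This is a minimal illustration in Python, not Multics code: the class name, the method names, and the use of a dictionary for the APT are all assumptions made for the sketch, and the real calls pass through intermediate supervisory routines that are omitted here.

```python
from collections import deque


class TrafficController:
    """Toy model of the TC's block and wakeup entries (hypothetical names)."""

    def __init__(self):
        self.apt = {}              # process id -> execution state (stand-in for the APT)
        self.pending = {}          # process id -> message left by a wakeup call
        self.ready_list = deque()  # ready processes, top of the list first
        self.running = None

    def add_process(self, pid, state="ready"):
        self.apt[pid] = state
        if state == "ready":
            self.ready_list.append(pid)

    def dispatch(self):
        """Switch the processor to the process at the top of the ready list."""
        self.running = self.ready_list.popleft() if self.ready_list else None
        if self.running is not None:
            self.apt[self.running] = "running"
        return self.running

    def block(self, pid):
        """Mark pid blocked, then give the processor to the next ready process."""
        self.apt[pid] = "blocked"
        return self.dispatch()

    def wakeup(self, pid, message):
        """Record the message for pid and, if pid is blocked, make it ready."""
        self.pending[pid] = message
        if self.apt[pid] == "blocked":
            self.apt[pid] = "ready"
            self.ready_list.append(pid)

    def return_from_block(self, pid):
        """What block hands back to pid once pid is running again."""
        return self.pending.pop(pid, None)


tc = TrafficController()
tc.add_process("A", state="running")
tc.running = "A"
tc.add_process("B")
next_pid = tc.block("A")        # A finds no room in the buffer and blocks
tc.wakeup("A", "message_n")     # B later signals the event A is waiting for
state_after_wakeup = tc.apt["A"]
```

In this sketch A does not run again at the moment of the wakeup call; it only rejoins the ready list and must still win a processor, which mirrors the distinction drawn above between being awakened and resuming execution.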


intermediate step between box 2 and box 3 which was omitted 
keep its structure as simple as possible during a first view.
to box 3, an intermediate system routine makes a check to be 


expected event message and not a spurious or irrelevant one 


ived,* If spurious, then box 2 is repeated. The schematic 


aces box 2 and the delay that follows. 


[Figure 7-3 More detail for box 2. Box 2: call block (arg1,
location_for_message_n). Box 2a: expected message? Branch shown: Yes.]


ucture is appropriate to replace box 9, 


ll expand the detail of box 10 in the model given in Figure 7-2. 
at having initiated teletype output, the manager process can do 
this relatively slow output operation is completed. We suppose — 
ircumstances it is appropriate for the tty manager to give up 
salling block, In this case it would be more realistic to consider 


n Figure 7-4 to replace box 10, 


process P may be in communication with more than one other 
and C, At any one point, however, process P may enter the 
wait a wakeup signal from B. Suppose that shortly thereafter 
eup signal, for whatever reason. Chaos would then result if P
id permitted to proceed on the assumption that its expected signal 


received. 


[Figure 7-4 boxes: initiate an I/O operation (type on a teletype the set of
up to 90 characters now in the write buffer); box 11: call block (arg1,
location_for_event_p); box 11a: expected message? (no / yes); reset pointers
in the write buffer; to box 6.]

Figure 7-4 More detail of the initiation of an I/O operation


How, then, will the tty manager receive word of the completion of the I/O operation?
That is, who (what process) wakes up the manager so that execution may
proceed to box 12? The user process, A, may itself be in one of the nonrunning
(blocked, waiting or ready) states while the tty manager is blocked. Thus, process
A cannot be counted on for any help. Clearly, some "third" process must
be involved. In the Multics I/O system design, the third process is any process
that happens to be executing on the processor when it receives a hardware interrupt
signal that is intended to indicate completion of the invoked I/O operation.
The executing process is forced to play a Good Samaritan role because all system
interrupt signals (I/O completion signals are examples of such interrupts) are
handled by the Traffic Controller. By design, an I/O completion signal comes
into the GE 645 memory and triggers the interruption (trapping) of whatever process
happens to be executing on the affected processor. An invoked interrupt
interceptor module then converts this signal into a wakeup call to the TC, identifying
the process that should be waked up and providing it a message that signifies the
device on which the I/O task has been completed. Just as soon as the call to wakeup
is completed, control returns from wakeup, and routine execution of the interrupted
Good Samaritan continues. The foregoing concepts are suggested in Figure 7-5.


The test in box 2a of Figure 7-3 (and in box 9a, if it were drawn in a similar
fashion) suggests an essential characteristic of meaningful communication
among coexisting (and cooperating) processes. A process may receive more than
one message or signal (from one or more processes). Each legitimate signal
could have different significance to a receiving process. In most instances it is
essential that the reawakened process be able to identify the sender and the nature
of the message, if proper interpretation of the "reawakening" is to be made.


For example, consider our hypothetical tty manager as the receiver of
messages. Such a process could serve not just one user, but all users who are
using teletype consoles for output or input. In that event, the loop (boxes 6 through
10) of Figure 7-2 would clearly be an oversimplification. When awakened the tty
manager must identify which user process is sending a message and moreover
which type of message it has received, so that it can act accordingly, i.e., so
it can resume a read loop to initiate more output on the teletype or so it can resume
a write loop to initiate more input to read a buffer from the teletype, for
some process. While the manager is not running, it must somehow be in a position
to receive such messages in an orderly way, so that when again in the running
state the message(s) received in the interim can be properly interpreted.
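
One way to picture this orderly receipt of messages is a queue in which each entry carries its sender and its kind. The sketch below is only an illustration in Python; the `Message` fields, the `Mailbox` class and the message kinds are hypothetical names chosen for the example and are not taken from the IPC facility.

```python
from collections import deque, namedtuple

# Hypothetical message layout: who sent it, what kind it is, which device.
Message = namedtuple("Message", ["sender", "kind", "device"])


class Mailbox:
    """Messages accumulate here while the receiver is not running."""

    def __init__(self):
        self._queue = deque()

    def send(self, sender, kind, device=None):
        self._queue.append(Message(sender, kind, device))

    def drain(self):
        """Return every message received so far, in arrival order."""
        received = list(self._queue)
        self._queue.clear()
        return received


def handle(msg):
    """Decide what the manager should do for one message."""
    if msg.kind == "chars_in_buffer":
        return f"start output for {msg.sender}"
    if msg.kind == "io_complete":
        return f"reset buffer pointers for {msg.device}"
    return "ignore"  # spurious or irrelevant: the manager would block again


box = Mailbox()
box.send("user_A", "chars_in_buffer")
box.send("interrupt_interceptor", "io_complete", device="tty_3")
box.send("user_C", "unrelated")
actions = [handle(m) for m in box.drain()]
```

The point of the design is that nothing is lost while the receiver is blocked or ready: the senders only append, and interpretation is deferred until the receiver runs.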


Multics provides a general mechanism known as the IPC (interprocess
communication facility) to achieve the transfer of messages (signals) between
processes. "Receipt" of messages can occur while the process is in any execution
state, because the sender, using the IPC facility, can place a message in
a shared data base which the receiver will examine and interpret at a later time,
also with the aid of the same IPC facility.* Subsystem designers will have little
interest in the details for transmitting messages between user processes and


* The shared data base and its manipulating procedures are necessarily in ring 0
for protection reasons.

Figure 7-5 Illustrating the conversion of system interrupts to
           process wakeups


system processes like the I/O driver, since these are entirely controlled by
built-in functions of the I/O control supervisory procedures. On the other hand,
the same techniques for interprocess communication also apply to subsystems
in which two or more user processes must communicate with one another for effective
operation. Here, the designer must provide the explicit calls on the IPC
facility. For such subsystems, the designer must become more fully acquainted
with the IPC. Section 7.5 provides the basics.


A final observation is in order in this introduction concerning interprocess
communication. This has to do with the distinction between data communication
and control communication. In the example of Figure 7-2, the data passing into
and out of the write buffer may be regarded as data communicated between the
two processes. The messages transmitted by the wakeup function and examined
by the block function, though also data in one sense, nevertheless serve as control
communications in that their net effect, like stop and go signals, is to permit the
starting up of a blocked process. There is an analogy between these two types
of communication and two types of computer instructions. Control communication
corresponds to a "transfer" type of instruction, and data communication
corresponds to a "read or write memory" type of instruction.


7.2.4 Other Control Functions of the Traffic Controller 


The Traffic Controller contains modules needed for the purpose of creating
processes, for destroying them, and for halting processes in anticipation of destroying
them. Additionally, the TC is able to cause the loading of a process.
Loading a process amounts to placing in memory a limited number of selected
segments, page tables, and other information whose guaranteed presence in
memory is essential if the process is put into the running state. We shall refer
to this set of process information as the minimum core image (MCI). Among
the components of the MCI are the APT entry for the process, a ring-0 descriptor
segment, and a special ring-0 process state segment named PDS (Process Data
Segment).


Generally speaking, the subsystem designer need pay little attention to
these essential supervisory functions, since they must be carried out as a matter
of course in normal operations. Thus, during login a so-called "working process"
is automatically created for the user and during logout that process is destroyed.
Moreover, it is also the responsibility of the Traffic Controller to see to it that a
process reaching the top of the ready list has a minimum core image.* In other
words, loading of the MCI is supervised on behalf of the working process whenever
necessary.


With all this machinery for "managing processes" already necessary (and
available) as supervisory functions, it is not surprising that the Multics design
is aimed at giving a sophisticated user the opportunity to exploit some of these
functions for his own purposes.


Two types of user applications are envisioned. The first is almost fundamental
because of its relationship to console debugging. The second relates to
the user's management of a subsystem in which one process spawns others.

1. Stopping a process so as to debug it

During a console session the user will often find cause to stop a process
now in execution (running, ready or blocked). He may notice, for instance, that
his working process is in an endless (or undesirably long) output loop, suspect
an endless computation loop is in progress, or for other reasons wish to halt the
process and take stock of the situation, i.e., enter into certain on-line debugging
activities. The Multics design makes it feasible to carry out such console debugging
by providing the user a simple-to-use facility to accomplish the following:
     a.  Cause his current working process to be "stopped".

     b.  Cause a new working process to be created and activated
         on the user's behalf which will now respond to his console
         commands. (No new login is necessary, mind you.) The
         new process can now be used to "inspect" segments such
         as the stacks of the stopped process, using debugging
         procedures that execute in the newly created process.

         When Multics is fully implemented, a user will be able to
         achieve steps a and b simply by pressing the quit button
         and then issuing a "save" command on his console. The
         effect will be to signal an always-coexisting process, called
         the answering service, asking it to do these chores.


* The loading task is carried out, in fact, by waking up a system loading process
to do the job. This special process is of the highest priority and is itself always
loaded. Consequently, the Traffic Controller's request for the loading of a process
will receive relatively quick response. We will add more detail to this
frame of reference in Section 7.4.
     c.  If, after inspecting the stopped process, the user deems it
         "resumable", possibly after "doctoring" one or more of its
         segments in some fashion, then he may destroy the new or
         current working process and resume (put back into the ready
         state) the old working process. This step, of course, implies
         that his console will be reattached to the resumed process.

         Step c would be accomplished by typing a command.


     d.  Alternate decisions might be either to save the old process
         for future resumption or save certain of the temporary files
         of this process. Saving a previous working process or any
         of its temporary parts is simple enough and is achieved by
         typing a simple command. Providing the system support for
         the practice of resuming a saved process at a much later time
         may await further system research and development. This
         is because such practice implies that the "state" of the Multics
         supervisor and the Multics library will, at the time an old
         process is resumed, be sufficiently like its original state to
         make resumption of the process meaningful. But how can we
         be sure that the option to resume the process at some later
         time can be successfully exercised? This problem is intrinsic
         to all information utilities whose supervisory code and system
         libraries evolve at some finite rate. Presumably a process
         will be stopped and saved in a state where its execution point
         lies outside of any supervisory procedure or library code.
         If supervisory or system library code has been altered in the
         interim, resumption of the process at a later time can be
         "effective" if and only if re-execution of said altered system
         codes is (Case a) not required, is (Case b) functionally similar,
         i.e., the altered procedures retain their original interface
         (e.g., same argument list, external references, etc.), or
         (Case c) old copies of the altered code can be retained for use
         at process resumption time.


Clearly, Case a is purely a lucky circumstance. As a general solution,
Case c implies serious and perhaps insurmountable problems in system design
and system management. This approach requires that in the limit it would be
necessary to furnish a user with a complete copy of an older Multics system to
resume a saved process.


Only Case b implies serious promise for a general solution. When systems
like Multics or their successors reach a sufficiently stable state of development,
it is not inconceivable that a contract between system administrator and a subscriber
will imply a commitment to provide (over a certain time span) a stable
functional interface to the supervisor and other system-supplied code, i.e.,
sufficiently stable to provide effective resumption of long-saved processes.
(Needless to say, we are "not there yet", either in Multics or in any other system
of like objectives.)
One remark is appropriate after considering this example. The TC is not
only designed to assist in the stopping of a process, but is also able thereafter to
recognize such a process, by marking the execution state (in its APT entry) as
stopped. We see then that there are in actuality a total of five execution states
that are recognized:

     running, ready, waiting, blocked, and stopped.

A process is marked "stopped", as explained below, when it has no use for
a processor and has no expectation of needing one, i.e., is not expecting a wakeup.
Putting a process in the stopped state prevents later wakeups received from
cooperating processes from accidentally restarting a quit process.
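
The effect of the fifth state on wakeups can be sketched in a few lines. This is an illustrative Python model only; the enumeration and the function name are assumptions for the sketch, and the actual APT encoding is not described in this chapter.

```python
from enum import Enum


class State(Enum):
    """The five execution states recognized by the TC."""
    RUNNING = "running"
    READY = "ready"
    WAITING = "waiting"
    BLOCKED = "blocked"
    STOPPED = "stopped"


def state_after_wakeup(state):
    """A wakeup readies a blocked process and leaves every other state alone.

    In particular a stopped process stays stopped, so a late wakeup from a
    cooperating process cannot restart it by accident.
    """
    return State.READY if state is State.BLOCKED else state
```

For example, `state_after_wakeup(State.BLOCKED)` yields the ready state, while `state_after_wakeup(State.STOPPED)` returns the stopped state unchanged.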


2. Stopping a process in the general subsystem case


A special entry is provided in the TC which can be used by one process
(A) to stop another process (B). The form of the call is:

     call stop (id of process B);

Of course, the call must be and is quite privileged. No user can be permitted to
use it in an indiscriminate fashion or the entire system would quickly collapse.

On the other hand, with proper safeguards, it would be very useful to grant such
permission for user process A to stop (and possibly even then destroy) user process
B, provided, however, A and B were related to one another in a meaningful
way. For example, Multics provides a subsystem with the capability for one
process to spawn one or more other processes (much as certain system processes
must be capable of doing). Each such spawned user process would then belong
to the same "process tree" (as those that have a common ancestor user process).
In a fully implemented version of Multics a supervisory module, such as the TC,
would conceivably be able to recognize members of the same process tree* so as
to screen requests of the form:

     call stop (B);

* The coding scheme that will permit this recognition is not yet finalized as of
this writing.


A subsystem designer will, therefore, be able to write code which makes
an initial working process A spawn processes B, C, ..., etc. Any of these may
be coded to spawn others. All belong to the same process tree. Any one process
might reach a decision to stop another in the same tree on the basis of "cross
talk", i.e., interprocess communication between or among two or more of those
coexisting within the group. We leave to the imagination of the reader the
possibilities for subsystem design that are implied by virtue of these capabilities.
Further consideration of this topic here would be premature prior to an examination
of the Multics interprocess communication facility (IPC) itself, which is
introduced in Section 7.5.
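
Since the text notes that the recognition scheme for process trees was not yet finalized, the following Python sketch is purely hypothetical. It assumes each process records its parent, and it screens a stop request by checking that caller and target share a common ancestor; the names `root_of`, `may_stop` and the `parent` table are inventions for the example.

```python
def root_of(pid, parent):
    """Follow parent links until the common ancestor (the tree root) is reached."""
    while parent.get(pid) is not None:
        pid = parent[pid]
    return pid


def may_stop(caller, target, parent):
    """Allow a stop request only between members of the same process tree."""
    return root_of(caller, parent) == root_of(target, parent)


# Hypothetical trees: A spawned B and C, B spawned D; X belongs to another user.
parent = {"A": None, "B": "A", "C": "A", "D": "B", "X": None}

same_tree_request = may_stop("D", "C", parent)     # both descend from A
foreign_request = may_stop("X", "B", parent)       # different trees
```

A real screening module would presumably apply further safeguards beyond tree membership; this sketch shows only the membership test itself.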


7.3 CORE RESOURCES EMPLOYED AND MANAGED BY AN ACTIVE PROCESS


In this section we shall examine core requirements of an active Multics
process during various phases of its existence (from the time the process is
created until it is destroyed). We will also consider the system implications
of these core requirements in the multi-process environment in which all coexisting
processes compete or tend to compete for core. (When a user logs in,
a process is created on his behalf by a pre-existing system process which responds
to the login command. The newly created process is registered in the
Active Process Table (APT) and then given active status by loading into core
the page tables for a small group of key segments.* The process normally remains
active until the user logs out, at which time the process is destroyed.+)


The execution state of a process not only characterizes the process as
a competitor for a processor, but it also suggests how a process
functions as a competitor for core.


A running process will attempt to and in fact may capture as much core as
it needs. It will be restricted from doing so while it is executing only by virtue


* The initial Multics implementation rigidly couples activation and deactivation
of a process with its creation and destruction. A more flexible connection,
e.g., dynamic activation, is also possible, and in principle, could be added
to the system at some later time. Dynamic activation would make it convenient
for Multics to support subsystems which exhibit a large number of only occasionally
used processes.

+ A user's process can spawn other processes, which become "active" in the
same way.


of competing demands of processes which are simultaneously executing on other
processors. In a single-processor environment, the longer a process is allowed
to execute without interruption, the larger can its "core holdings" become.


A ready process competes for a processor mainly with other ready processes.
At any given time, a ready process will be queued in some fashion dependent,
among other things, on the respective priority levels of the set of ready processes.
The specific queuing discipline is discussed in Section 7.4. As a competitor for
core, a ready process is a "loser". Because it is not executing, a ready process
is unable to initiate the acquisition of core. Attrition can occur in its core holding
due to the demands made by the executing process(es). A new page is brought
into core for an executing process at the "expense" of some other page. The
Multics algorithm for selecting pages to be "thrown out" in favor of new ones
is such that least recently referenced pages tend to be preferentially selected
for removal. Thus, in principle, the longer a process remains in the ready
state (hence, not making references to its pages, to the core-resident portion
of its address space) the more likely it will suffer a loss of pages. In actual
practice, a small group of processes will at all times belong to the class of
so-called "eligible processes". At any given time a small number, n, of active
processes are allowed by the Traffic Controller to compete for a processor.
These are the eligible processes. The number n is recomputed periodically
by the Traffic Controller; n is a measure of the capacity of the system to effectively
serve its clients short of an overload. An eligible process will normally
gain sufficiently frequent use of a processor, such that its most recently referenced
pages will not be purged during any period of its residence in the ready
state.
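
The attrition suffered by a ready process can be illustrated with a strict least-recently-referenced replacement rule. This Python sketch is an idealization chosen for clarity: the text says only that least recently referenced pages "tend to be" selected, so the actual Multics algorithm should not be assumed to be exact LRU. The class name `Core` and the page labels are hypothetical.

```python
from collections import OrderedDict


class Core:
    """A pool of core blocks with strict least-recently-referenced removal."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.pages = OrderedDict()  # oldest reference first, newest last

    def reference(self, page):
        """Reference a page; return the page removed to make room, if any."""
        if page in self.pages:
            self.pages.move_to_end(page)  # now the most recently referenced
            return None
        removed = None
        if len(self.pages) >= self.nblocks:
            removed, _ = self.pages.popitem(last=False)  # oldest reference goes
        self.pages[page] = True
        return removed


core = Core(nblocks=3)
core.reference(("ready_proc", 0))    # last touched before the process lost the processor
core.reference(("running_proc", 0))
core.reference(("running_proc", 1))
first_removed = core.reference(("running_proc", 2))   # core is full: something must go
```

Here the page of the ready process is the first to be removed, simply because it has gone longest without a reference while the running process kept touching its own pages.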


A waiting process is one that cannot make immediate use of a processor
because it is waiting for a so-called system event to happen, for example, the
arrival of a page into core, the request for which was initiated while this process
was last executing. As a competitor for core, a waiting process is a loser in
roughly the same sense as a ready process. However, because a process that
goes to the wait state for a system event is expected to remain there for only a
brief and predictable period of time, it is allowed to retain its eligibility during
this period. It also retains its favorable position in the queue. If sufficiently
favorable, the readied process may in fact pre-empt the processor. In general,
residence in the ready state will tend to be for short periods, hence short periods
between execution states and hence minimal attrition of its core holdings.


A blocked process is waiting for a (process) event whose time of "arrival"
is not in general predictable. As a competitor for core, all we can say is: the
longer the process remains blocked, the more of its (non-shared) core-resident
pages will be removed. If a process is blocked long enough, all its unshared
pages, including its descriptor segments, will be paged out. Page tables for
most of these segments will also be deleted. Care is taken, however, in the
system design to retain the page tables for several critically important segments
(such as for the process state segment (PDS) and descriptor segments)*. By
retaining page tables for these important segments, the process does not have
to thrash about, taking an undue number of segment and page faults when it next
re-enters the running state. Page tables and pointers to these segments and to
all other segments which remain active are retained in a special wired-down
table area known as the System Segment Table (SST)+. Of course, there will
always also remain a "core residue" consisting of pages and page tables of the
shared supervisory segments and perhaps also some shared library segments
(both wired-down and otherwise).


A stopped process is the same type of core competitor as is a blocked
process. The only difference is that a stopped process may be destroyed at
the request of another process.++ In the course of being destroyed, segments
that are categorized as temporary (and filed in its "process directory") are
deleted along with their file branches. These include the KST, the various individual
and combined linkage segments of the process, etc. The space occupied
by these files, on whatever device they happen to occupy, is returned to the free
storage pool of that storage device by the storage allocating routines of the system.
Thus, core space occupied by pages (and page tables) for temporary segments
is immediately reclassified as free. (Free space is dispensed first in
satisfying page requests of executing processes. Only after this pool is exhausted


* Strictly speaking, even these page tables would be removed if the process were deactivated.
The decision to deactivate (i.e., unload) a process would be made by a system control process
on behalf of a process that has been blocked for a lengthy period. A deactivated process is still
"remembered" in a process table of the system so that it can be reactivated when necessary.

+ The detailed architecture of the SST is given in BG.2.

++ Moreover, a stopped process may be deactivated.


will other pages be removed.) Blocks occupied by the remaining pages and page
tables of the destroyed process will be reused as needed by the system's paging
algorithm.*


From the above discussions we see that, as a process cycles through its
execution states, there also is a marked tendency for its core holdings to ebb
and flow cyclically. It is to the user's advantage that in the typical cycle the
pages lost in going from the "crest" reached during its current execution state
to the "trough" reached during its next wait or block state, should not include
those which will be needed during some reasonable interval after the process
next re-enters the running state. There are two obvious reasons:

     1.  Increasing the number of page faults taken by a process will perforce
         increase the total elapsed time from the beginning to the
         end of a process, since the time spent in the wait state increases.

     2.  The processing of each page fault adds several milliseconds to
         execution time and this time is (and probably must be) charged
         to the faulting process.
The system's scheduling and page removal algorithms are designed to
help keep this potential loss of efficiency from becoming severe. As previously
mentioned, the Traffic Controller limits the number of users that may compete
for a CPU at any one time. The eligibility restriction has the effect of keeping
the paging activity in the system at a tolerably low level (as a percent of CPU
usage). The by-product effect of this restriction is that the average number of
pages allotted to each competing process can be kept above some desirable minimum
value. If the number of eligible processes is made too low, however,
eligible processes may compute efficiently (i.e., for very long periods between
page faults), but there will be too much idle time when the page faults do occur,
while at the same time the system's response to the non-eligibles may become
unacceptably poor. As experience in the use of Multics grows, the sophistication
of the various controls employed by the Traffic Controller can be expected to
increase in the resultant direction of an optimal balance among these conflicting
needs.
* An excellent view of this algorithm can be found in the paper, "A Paging
Experiment in the Multics System," by F. J. Corbato, Multics repository
document M0104.


7.3.2 Minimum Core Requirements of an Active Process


We have already alluded to the idea that there exists in Multics a minimum
core commitment for each active process. In this section we will begin to gain
insight into the core management aspects of Multics that relate to these
memory-resource requirements of an active process.

From an overall view, core can be viewed as composed of three parts:

     (a)  wired-down supervisor code (very roughly 25K words as of the
          summer of 1969),

     (b)  system-wide tables and I/O buffers (perhaps 50K words*), and

     (c)  the remainder, consisting of core blocks of a pool that is managed
          by a wired-down core allocator. These blocks are for the pages
          of non-wired procedure and data segments (approximately 300K
          words maximum, in the present configuration at Project MAC).
Of the system-wide tables mentioned in (b), the SST (System Segment Table) is
of chief importance in this discussion. This is a table that includes an entry for
every active segment in the system and cross-references each of these segments
with the processes which presently share them. For each active segment there
is also included in the SST its corresponding page table. Saying that a segment
is active, therefore, mainly reflects the fact that its page table is currently in
core. Segments are limited to 64 pages (1024 words each),+ so their page
tables are stored as 64-word blocks.++


* Space allotted in these tables is a function of the total core space available and of the
anticipated number of active processes permitted in the system at any one time.

+ The SST provides page table space for enough page tables (npt) to ensure that there is, in
fact, always an excess of page tables over the number of segments (nseg) having a page or
more in core. The excess, npt - nseg, is used to retain page tables for vital segments of all
active processes (including those not eligible), and also to retain page tables for segments
that are the ancestor directories of all active segments. Experience shows that keeping page
tables for these segments in core drastically depresses the number of segment faults (and
by-product page faults) that are normally incurred as a result of process switching and as a
result of file system operations. This amounts to a form of "preventive maintenance" for system
efficiency, since each segment fault currently costs around 16 ms, and each page fault around
6 ms. Clearly, reducing the number of these faults also means reduced delay or latency in
the execution of any one task.

++ In Chapter 1 of this Guide, we mentioned that the GE 645 hardware was capable of supporting
segments of up to 2^18 or 256K words. However, software considerations have recently
resulted in a decision to limit segments to 64K.
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In terms of the overall view of core just presented, we now discuss two 
types of minimum core requirements for an active process. | 


be A static minimum, which is the core committed to the active — 
process while it is not running. This is the core needed for an _ 
effective transition into the running state. As long as the process 
remains active (1.¢., has an entry in the APT), the static mini- 
mum is retained. In the present implementations (Summer, 1969) 
it is a core space of approximately 230 words and, apart from the 
16-word entry inthe APT, is drawn entirely from the SST. 


2.  A dynamic minimum, which is the core needed when the process
    is running. The additional core space implied in this minimum,
    while partly drawn from the SST, is mainly drawn from the general
    pool of 1024-word core blocks. Approximately six blocks are currently
    needed for pages of several "vital" system-provided segments.
It will be useful to enumerate components that make up each of these minimum
"sets" even though continuing implementation improvements may make
these details rapidly obsolete. Table 7-1 lists the static set and Table 7-2 lists
the dynamic set. The following discussion provides some functional explanation
of these two sets. It is not intended as a complete discussion, only as the beginning
of a plausible explanation for the curious.


First we shall suggest the reason for maintaining, in active status, the
three listed segments of Table 7-1. The system is designed so that resumption
of the running state will result in references to each of these segments, and
hence segment faults to them either should or must be avoided. At least two of
the listed segments will almost immediately be referenced when the process
resumes the running state. These are: the process state (data) segment, <pds>,
and the ring-0 descriptor segment. (In the current implementation, the process
state segment includes data areas previously allotted to various ring-0 segments,
e.g., <pdf>, <rtn_stk>, <stack_00> and <kst>.) In making references to the
first two of these, neither segment faults nor page faults to certain pages therein




* The process executed last while in the Traffic Controller (ring 0). It was using page 1 of the 
ring-0 descriptor segment and a special stack (called the "process concealed stack'') that is kept 
in the process state segment. This segment also contains vital process state information such 
as the current ring number, which must be accessible to the system while it is processing page 
faults.


Upon returning from the block (or wait) entry of the Traffic Controller, the process will still
be in ring 0, so the ring-0 stack, also embedded in the process state segment, will also be
needed.


                              TABLE 7-1

                            The Static Set
   (Minimum core requirement while an Active Process is not running)

   Item                                               Number of Words

   1.  APT entry                                             16

   2.  SST entries for segments
       (72 words/entry, including a 64-word page table):

       a.  ring-0 descriptor segment <desc_0>                72

       b.  "process state segment" <pds>                     72
           (combines in one segment the functions of
           <pds>, <pdf>, <stack_00>, <rtn_stk>,* and <kst>)

       c.  ring-i descriptor segment <desc_i>                72

                                                  Total     232 words


* Discussion in earlier chapters of this Guide mentioned these components as
separate segments. Condensation into one segment resulted from an implementation
improvement in the Summer of 1969.


                              TABLE 7-2

                            The Dynamic Set
     (Minimum core requirement while an Active Process is Running)

   Item                                     Pages     Number of Words

   1.  Items listed in Table 7-1                            232
       (3 SST entries and 1 APT entry)

   2.  SST entries for segments
       (72 words/entry, including a page table):

       process directory (ring 0)                            72
       ring-i stack                                          72
       ring-i combined linkage segment                       72

                                               Subtotal     448

   3.  Pages for segments:

       (a) ring-0 descriptor segment         (1)*          1024
       (b) process state segment             (2)*          2048
           (see Table 7-1)
       (c) ring-i descriptor segment         (1)*          1024
           ring-i combined linkage segment   (1)           1024
           ring-i stack                      (1)           1024

                                                  Total    6600
                                                  (in round numbers)


* These pages are preloaded before the process begins to run. All other pages
listed here are brought in as a result of page faults. Pages of the process state
segment and the first page of the ring-0 descriptor segment, once loaded, are
treated as wired-down so long as the process is in the running state.


can be tolerated, so actual pages for these segments will be read (back) into
memory, as necessary, before the process is switched to the running state.


Typically, the Traffic Controller was entered either as a result of a
system interrupt, while the process was executing in some ring other than 0, or
indirectly, as a result of a call from another ring i. The Gatekeeper must be able
to effect a return to ring i, implying need for the presence of the page table to
(and a page from) ring i's descriptor segment.


Once a process enters the running state, the pages of its core holdings
(beyond those (four) that are preloaded for it) will rapidly expand in number.
Segment faults will be incurred in referencing other segments such as those
listed in Table 7-2, item 2.


A segment fault results in the creation of an SST entry (72 words) and a
page request for the referenced page. In a "busy" system new SST entries can
be created only at the expense of old ones which are in some sense candidates
for removal. Just as new pages replace ("old") pages that have not recently
been referenced, new SST entries replace those for segments which have no
pages remaining in core. (Of course, certain types of SST entries, such as
the per-process segments listed in Table 7-1, are not candidates for removal
while the process is active.)


As shown in Table 7-2, a process will tie down a minimum of six SST
entries and a minimum of six pages while it performs even the simplest of tasks.
Additional core space is needed for its non-supervisory procedures and data
segments. Normally, the pages and SST entries in Table 7-2 will be referenced so
frequently that the system removal algorithms will never select them as candidates
for removal while the process is eligible. Presumably, the same will be true
for frequently-referred-to pages of the user-provided segments of a process.
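
The totals in Tables 7-1 and 7-2 follow from three sizes quoted in the text: the 16-word APT entry, the 72-word SST entry, and the 1024-word core block. The short Python check below merely redoes that arithmetic; the constant and variable names are chosen for this illustration and are not Multics identifiers.

```python
# Sizes quoted in the text and tables (Summer 1969 implementation).
APT_ENTRY_WORDS = 16   # one APT entry per active process
SST_ENTRY_WORDS = 72   # 8-word AST entry plus 64-word page table
PAGE_WORDS = 1024      # one core block

# Static set (Table 7-1): the APT entry plus SST entries for three segments.
static_set = APT_ENTRY_WORDS + 3 * SST_ENTRY_WORDS

# Dynamic set (Table 7-2): three more SST entries, then six pages
# (descriptor segments, two pages of the process state segment,
# combined linkage segment, and ring-i stack).
table_space = static_set + 3 * SST_ENTRY_WORDS
pages = 1 + 2 + 1 + 1 + 1
dynamic_set = table_space + pages * PAGE_WORDS

print(static_set, table_space, dynamic_set)
```

The exact dynamic total comes to 6592 words, which the table rounds to 6600.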
The Working Set 


Following Denning* we shall refer to the Table 7-2 list, plus the other 


pages and page tables of the user segments that are being referred to very 


* Peter J. Denning, "The Working Set Model for Program Behavior," Communications
of the ACM, May, 1968, Vol. 11, No. 5, pp. 323-333. Also, "Resource
Allocation in Multiprocess Computer Systems," May, 1968, Project MAC Technical
Report 50 (Ph.D. Thesis of P. J. Denning).


frequently, as the working set of a process. The page removal and SST-entry
removal algorithms of Multics are expected to "honor" the working set in the
sense that its components tend to remain in core over the period of time the
process is executing. During this time, demands by the same or coexisting
processes for a large number of less-frequently-needed pages and their SST
entries can also be satisfied without seriously affecting the working set (or
working sets of the eligible processes).
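
To make the working-set notion concrete, the sketch below follows the window form of Denning's definition: the working set at time t is the set of pages referenced during the most recent tau references. The trace format, the page names, and the window length are assumptions made for this illustration only; the Multics removal algorithms are not claimed to compute the set this way.

```python
def working_set(references, t, tau):
    """Return the set of pages referenced in the window (t - tau, t].

    `references` is a list of page names; the entry at index k is the
    page touched at (virtual) time k + 1. Hypothetical trace format.
    """
    start = max(0, t - tau)
    return set(references[start:t])


# A made-up reference string mixing supervisor and user pages.
trace = ["pds", "stack", "linkage", "pds", "user_a", "pds", "stack", "user_a"]

print(sorted(working_set(trace, t=8, tau=4)))
```

A longer window admits less-frequently-used pages into the set; a shorter one keeps only the pages touched most recently.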
7.3.2 The System Segment Table and Shared Segments


This subsection and section 7.3.3 describe some of the inner workings of
the file system's key modules and data bases used in creating and managing the
Multics virtual memory. The material is provided mainly for the sake of
completeness.* It is certainly not essential in the flow of ideas for this chapter.
These subsections do, however, help one to appreciate some of the challenges
that have faced the Multics System designers and show why the success of a
subsystem may well depend on how successfully its designer has minimized the
load (segment and page faults) that the subsystem places on the file system.


Since space for entries in the SST is limited, it will be the usual case that
some segment must be deactivated so that another may be activated. Deactivating
a segment, therefore, means relinquishing SST table space for its SST entry.
Each such entry consists of an 8-word "Active Segment Table" (AST) entry and
its associated 64-word page table. An active segment becomes a candidate for
deactivation if it satisfies these conditions:


1.  The "wired-down" switch in its AST entry must be OFF.

2.  If this is an AST entry for a directory, its inferior count, i.e.,
    the number of its immediate descendants which are active, must
    be zero.


Among all those entries that satisfy the above conditions, select the segment
whose page table shows the fewest pages now in core.
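
The selection rule just stated can be sketched in a few lines of Python. The `ASTEntry` record and its field names are hypothetical stand-ins for the switches and counts mentioned in the text; the real AST entry layout is not described here.

```python
from dataclasses import dataclass


@dataclass
class ASTEntry:
    name: str
    wired_down: bool     # condition 1: this switch must be OFF
    is_directory: bool
    inferior_count: int  # active immediate descendants (directories only)
    pages_in_core: int


def is_candidate(entry):
    """Apply the two conditions for deactivation given in the text."""
    if entry.wired_down:
        return False
    if entry.is_directory and entry.inferior_count != 0:
        return False
    return True


def pick_for_deactivation(entries):
    """Among the candidates, choose the one with the fewest pages in core."""
    candidates = [e for e in entries if is_candidate(e)]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.pages_in_core)


sample = [
    ASTEntry("pds", wired_down=True, is_directory=False, inferior_count=0, pages_in_core=2),
    ASTEntry("John", wired_down=False, is_directory=True, inferior_count=3, pages_in_core=1),
    ASTEntry("alpha", wired_down=False, is_directory=False, inferior_count=0, pages_in_core=4),
    ASTEntry("beta", wired_down=False, is_directory=False, inferior_count=0, pages_in_core=1),
]
print(pick_for_deactivation(sample).name)
```

In this made-up sample the wired-down segment and the directory with active inferiors are passed over, and the choice falls on the remaining segment with the fewest pages in core.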


* A more complete discussion (very readable and interesting) of this topic may
be found in the paper, "The Multics Virtual Memory" by Bensoussan, A.,
Clingen, C. J., and Daley, R. C., Second ACM Symposium on Operating System
Principles, Princeton University, Princeton, New Jersey, October 21, 1969.


When the entry for such a segment is selected for deactivation, the
corresponding SDW for the process which lists this segment in its address space must
be located and marked appropriately with segment-missing bits. (This process
is termed in some of the Multics literature disconnecting a segment.) In this
way one is assured that a subsequent reference to this segment will incur a
segment fault and thereby invoke mechanisms to recreate an AST entry and page
table. Having put a "stop" in the appropriate SDW, Segment Control is then free
to proceed with the construction of the new AST entry and page table, and (call
Page Control to) page-in the referenced page. The faulting SDW word is then
altered appropriately and made to point to the newly constructed page table.
(This process is termed connecting a segment.)


If we consider that the old segment may have been shared, then removal
of its AST entry and page table implies the alteration of an SDW in the descriptor
segment of each process that shares it.

Recall that the SST cross-references every active segment with the processes
that share it. Segment Control, by proper use of the SST, is therefore able and
is sufficiently privileged to identify each process that currently shares any given
segment and additionally determine its segment number in each of these processes
(i.e., determine the segment pointer in each sharing process). To be more
precise, the cross-referencing design of the SST permits Segment Control to get at
and alter SDW's in each descriptor segment of every process that shares the
segment that is being deactivated.
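
A minimal sketch of how such cross-referencing lets one process disconnect a shared segment everywhere is given below. The dictionaries `sst_xref` and `descriptor_segments`, and the use of `None` for "segment-missing bits", are assumptions made for illustration; the actual SST and SDW formats are not given in this section.

```python
SEGMENT_FAULT = None  # stand-in for an SDW marked with segment-missing bits


def disconnect(segment, sst_xref, descriptor_segments):
    """Mark the SDW for `segment` as faulting in every sharing process.

    sst_xref maps a segment name to (process, segment number) pairs;
    descriptor_segments maps a process to its {segment number: SDW} table.
    Returns the list of SDW's that were altered.
    """
    touched = []
    for process, segno in sst_xref.get(segment, []):
        descriptor_segments[process][segno] = SEGMENT_FAULT
        touched.append((process, segno))
    return touched


# Segment <s> is known by a different segment number in each process.
sst_xref = {"s": [("B", 41), ("C", 17), ("D", 230)]}
descriptor_segments = {
    "B": {41: "page table of s", 5: "page table of x"},
    "C": {17: "page table of s"},
    "D": {230: "page table of s"},
}
print(disconnect("s", sst_xref, descriptor_segments))
```

Note that only the SDW's for <s> are altered; other entries in each descriptor segment are left untouched.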


Suppose we picture that process A is deactivating a segment <s> that is
shared by processes B, C, D, etc. Clearly, the job of marking with segment
faults a sizeable number of SDW's (i.e., disconnecting a sizeable number of
segments) cannot be done instantaneously. This means that some care must be
(and is) taken to prevent page-faulting references to <s> from being serviced
by processes B, or C, etc., while A, still executing in Segment Control, is
attempting to deactivate <s>. To permit the freedom for B or C, etc., to
request a new page in <s> at this time is to invite chaos.* Process A prevents


* This is because such references, if permitted, would imply that Page Control
would be working at cross purposes for different processes. For one process
it would be attempting to add pages for <s> (and remove page faults in the page
table words of <s>'s page table), while for another process it would be attempting
to reset the page table so it can be used for another segment.
this confusion from happening by setting certain flags and locks at key places in
the SST at the start of its deactivation task. The result is that no other process
sharing <s> will then be permitted to do more than idle in a loop, should it fault
to a page of <s> while <s> is being deactivated. When the deactivation is fully
completed, and all SDW's have been set with segment faults, the flags and locks
are reset, permitting the sharing processes once again to reference <s>. The first
such reference will incur a segment fault that will then result in a new activation
of <s>. More details for those interested are provided in the accompanying footnote.*
7.3.3 Handling of Segment and Page Faults


Segment faults and page faults are relatively costly in execution time and
wherever possible the subsystem designer should be on the lookout for ways to
avoid triggering these faults unnecessarily. Before examining ways to avoid
these faults (see Section 7.3.4) it is a good idea first to get a feeling for how
these faults are handled.


7.3.3.1 Segment faults currently require on the order of 15 to 20 ms of cpu
processing time. In addition, segment faults will usually trigger page faults
and possibly other segment faults as a by-product since both the segment fault
handling procedure and the data bases it looks at are not wired down. The steps
taken are roughly as given below. For purposes of illustration we shall picture
that the fault is taken for a segment <t> in ring i. Here we imagine the branch
for <t> is found in a user directory whose path name is >user_dir_dir>John.


1.  Consult the KST entry whose index is the same as the segment
    number of <t>. (t# is determined from the saved machine conditions.)
    Since the KST is itself always active, there will be no
    segment fault incurred in referencing it, but since the KST is
    a paged segment, a page fault may be induced before a reference
    to the desired KST entry is eventually achieved. Segment Control
    will obtain from the KST entry the segment number and offset
    of the directory branch for <t> in the segment <John>.


* If another process takes a page fault in the segment being deactivated, Page Control will notice 
the set flags and will properly interpret what is happening. It will then "tinker'' with the process 
so that it will believe it has taken a segment fault instead of a page fault. This is accomplished 
by altering the SDW of the segment that has incurred the page fault so it will cause a segment 
fault when next referenced, Following this adjustment Page Control simply returns, whereupon 
a repeated execution of the faulting instruction will cause a segment fault. When the process 
next attempts to solve its segment fault problem, it will be forced to wait on a lock (of the parent 
directory) which has been set and will remain set until the process doing the deactivating of the 
segment has finished its job. Whereupon, the flags and locks are reset, permitting other
processes to again activate the deactivated segment if necessary.


2.  Appropriate information needed for activating <t> is then copied
    from its branch if <t> is not already active.* In referencing the
    branch, another segment fault may be induced if <John>, although
    guaranteed to be active (by the ancestor-is-always-active rule),
    is not connected to its page table, i.e., has fault bits set in its
    SDW.+ Even when <John> is connected, the particular page wanted
    from it may be missing, thus inducing a page fault. The data copied
    from the branch is used to create a new AST entry and page table.
    Page table words (PTW's) are set with missing page faults and file map
    data. That is, the address field of the PTW is either set with
    pointer information for the page's address in auxiliary storage
    or is set to null (for page numbers that are either beyond the current
    length of the segment or that correspond to pages yet to be
    created).


3. After the page table is constructed and the new AST entry is cross- 
referenced to the process requesting it, the SDW words are ap- 
propriately set in the ring 0 and ring i descriptor segments, point- 
ing to the new page table. 
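
Steps 2 and 3 above can be sketched as follows. The dictionaries used for the branch, the AST, and the descriptor segment, as well as the device-address strings, are hypothetical; only the 64-entry page table size and the null/non-null address convention come from the text.

```python
PAGE_TABLE_SIZE = 64  # PTW's per page table (the 64-word page table)


def build_page_table(file_map):
    """Create PTW's that all fault, carrying file map data.

    file_map lists a device address for each existing page, or None for
    a page yet to be created. Page numbers beyond the file map also get
    a null address.
    """
    ptws = []
    for page_no in range(PAGE_TABLE_SIZE):
        address = file_map[page_no] if page_no < len(file_map) else None
        ptws.append({"in_core": False, "address": address})
    return ptws


def activate(branch, ast, descriptor_segment, segno):
    """Step 2: build the entry if the segment is not already active.
    Step 3: connect by pointing the SDW at the page table."""
    if branch["name"] not in ast:
        ast[branch["name"]] = build_page_table(branch["file_map"])
    descriptor_segment[segno] = ast[branch["name"]]
    return descriptor_segment[segno]


branch_t = {"name": "t", "file_map": ["drum:17", None, "disk:402"]}
ast, ring_i_descriptor_segment = {}, {}
page_table = activate(branch_t, ast, ring_i_descriptor_segment, segno=5)
print(len(page_table), page_table[0], page_table[1])
```

No page is brought into core by activation itself; every PTW still faults, so the first reference to any page of <t> goes on to Page Control.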

7.3.3.2 Page faults currently require on the order of 3 to 7 ms of cpu processing
time. We have already suggested the type of tasks that are involved in handling
page faults in earlier discussions, so we will not go into much more detail
on the matter here. Briefly, when Page Control is called to get a page, it is
handed, via the faulting machine conditions, a pointer to the faulting page table
word (PTW). Moreover, since the page table in which the PTW is found
has the same index as the corresponding AST entry, the latter's address in the
SST can also be determined. The AST entry would then be consulted to ascertain
if it is o.k. to proceed with the fetching (or creation) of the page. (You may
recall, this entry may have been flagged to indicate the segment is in the process
of being deactivated.) If the PTW address field is non-null, it contains device
address information for the wanted page, but if null, it indicates that a page of
zero-valued words is wanted.++ Page Control calls on a core-allocating


* Strictly speaking, all references to branches are made by Directory Control on behalf of
Segment Control.

+ In fact, a recursive sequence of such segment faults can occur in the unlikely event that all
parents (except the root) have fault-inducing SDW's (i.e., are disconnected). The recursion
ends at the root node because this item is by design always immediately accessible.

++ Space for such "empty" pages is created only when first reference is made to them. It is
never created and stored ahead of time. Hence, no page needs to be "transferred" from
secondary storage.


routine* to obtain the address of a free core block. (Such a request can easily
trigger a page-removal request if no free core blocks are immediately
available.)+ Page Control zeroes out this acquired block, resets the faulting PTW
to point to the new block, adjusts the AST entry to reflect the new condition of
the page table, and returns control to the faulting procedure.


If the pointer in the PTW is not null, Page Control again asks for a core
block address, initiates the drum or disc I/O request as appropriate to get the
wanted page from secondary storage, and performs its Good Samaritan chores
(notifies) as described in Section 7.1, and calls wait in the Traffic Controller.

You can see from the foregoing discussion that processing time for a page fault
will vary according to several factors. Certainly, the time to create a new page
of zeros rather than to bring one in will be short, i.e., of the order of one
millisecond (which is about the time required to write a thousand words of
zeros). Processing of several milliseconds will be needed in the more complicated
cases, where the page removal algorithm must be invoked and an I/O
request initiated. Of course, this does not count the actual time spent in the
page wait and in the (possibly) subsequent ready states.
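
The two cases just described, a null PTW address versus a device address, can be summarized in a short sketch. The PTW dictionary, the `core` mapping, and the returned strings are all assumptions made for this illustration, and page removal, notification, and the call to wait are deliberately left out.

```python
PAGE_WORDS = 1024


def handle_page_fault(ptw, core, free_blocks, being_deactivated=False):
    """Return a short description of the action taken for one page fault.

    core maps a block number to its contents; free_blocks is a list of
    free block numbers. Both are hypothetical stand-ins.
    """
    if being_deactivated:
        # The AST entry is flagged; the fault is to be retried as a
        # segment fault (see the footnote on deactivation).
        return "retry as segment fault"
    block = free_blocks.pop()  # in reality this may force a page removal
    ptw["block"] = block
    if ptw["address"] is None:
        core[block] = [0] * PAGE_WORDS  # a new page of zero-valued words
        ptw["in_core"] = True
        return "zero-filled"
    ptw["in_core"] = False  # still faulting until the transfer completes
    return "i/o started"


core, free_blocks = {}, [3, 7]
new_page = {"address": None, "in_core": False}
old_page = {"address": "drum:17", "in_core": False}
print(handle_page_fault(new_page, core, free_blocks))
print(handle_page_fault(old_page, core, free_blocks))
```

The zero-fill case completes without any I/O, which is why it is the cheap one; the other case leaves the process waiting on the drum or disc transfer.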


7.3.4 Ways to Reduce Segment Faults 


By now the subsystem designer reading this chapter should be more than
mildly receptive to suggestions for reducing the incidence of segment faults
(that are under his control to reduce). Two relatively obvious principles serve
as a guide.


1.  Because other eligible processes compete for page table space
    in the SST, a process having a large number of segments will
    tend to suffer a larger number of segment faults than a process
    with fewer segments. Hence, a conscious effort to keep the
    number of segments to a minimum will tend to reduce segment
    faults.


* Additional details on Core Control and its interaction with Page Control may be found in BG.5
and BG.6. Page Control itself is described in BG.4.


+ To remove a page involves invoking the page-removal algorithm about which we spoke
earlier, which "fingers" pages that are candidates for removal. Also invoked would be the
appropriate machinery to copy out the contents of said removal candidates on to auxiliary
storage. Copying of a page is performed only if the respective PTW indicates, via its
page-has-been-written bit, that the page has been altered while in core.


2.  For a given number of segments in a process it should be possible,
    by conscious programming effort, to organize a "computation" for
    a minimum of segment faults. Intuitively, this could be achieved
    if it were possible to sustain a high enough frequency of reference
    to the segments that were most recently referenced. In other words,
    if it is possible to design the process so that it maintains a high
    degree of locality* with respect to its segment references, the
    process will incur fewer segment faults.


Here are several approaches the subsystem designer can take to reduce or 


limit the number of segments of a process: 


(a) Avoid specifying multiple rings unless necessary because each
    ring in which the subsystem executes automatically adds a number
    of segments to the process, e.g., a descriptor segment, a stack
    segment, a combined linkage segment, and, when used, a signals
    segment and its special linkage segment.


(b) Bind+ procedure segments that belong to the same ring and bind,
    where feasible, data segments of the same ring that are to have
    the same access controls.


(c) Proliferation of procedure segments can be limited by a conscious
    effort to define internal functions and procedures, i.e., those that
    are defined within the body of external functions. Algol, PL/I and
    MAD provide good facilities for defining internal functions.
    FORTRAN does not.


(d) Use internal static and automatic variables (i.e., that will be
    placed in the ring-i stack or (combined) linkage segment) whenever
    possible, in preference to creating separate (external) segments
    for variables.


Here are several approaches the subsystem designer can take to increase
the degree of locality of his process:

1.  If the subsystem is multi-ringed, try to confine the computation
    within one user ring (or within as few rings as possible) for as
    long as possible.


2.  Avoid frequent use of loops in which there are explicit calls to
    the ring-0 supervisor.


* A thorough discussion of the locality concept is given by P. J. Denning in a
paper entitled "Thrashing, Its Causes and Prevention", Proceedings of the 1968
Fall Joint Computer Conference, Vol. 33, Part 1, pp. 915-922.


+ Refer to BX.14 for details of binding.


3.  Avoid signaling across rings entirely, or reduce the frequency
    of such signaling, i.e., avoid executing calls to <signal> that
    incur searches across ring boundaries for the active handler.
    (This is a reference to discussions in Chapter 5. If you have
    not read those details, pay no attention to this remark.)
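
The effect of locality on segment faults can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that a process may keep only `capacity` segments active at once and that the least recently referenced segment is the one deactivated; this is a simplification and not the Multics deactivation rule described in Section 7.3.2.

```python
from collections import OrderedDict


def count_segment_faults(refs, capacity):
    """Count faults for a segment reference string under LRU replacement."""
    active = OrderedDict()
    faults = 0
    for seg in refs:
        if seg in active:
            active.move_to_end(seg)         # referenced again: most recent
        else:
            faults += 1
            if len(active) >= capacity:
                active.popitem(last=False)  # drop least recently referenced
            active[seg] = True
    return faults


# The same twelve references, issued in two different orders.
scattered = ["a", "b", "c", "d"] * 3
clustered = ["a", "b"] * 3 + ["c", "d"] * 3

print(count_segment_faults(scattered, capacity=2))
print(count_segment_faults(clustered, capacity=2))
```

With room for two active segments, the scattered order faults on every one of its twelve references, while the clustered order faults only four times, once per segment.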


7.3.5 Ways to Reduce Page Faults


Alas, there are no revelational remarks that can be made on this subject!
Following the logic in the preceding discussion on segments, it is clear enough that
a process with fewer pages will, in the long run, generate fewer page faults.
What is really wanted is the recipe for minimizing page faults when the number
of pages in a process is already at its minimum. A high degree of locality (within
the pages of) each segment is wanted and this is a property which only the individual
programmer can, with his conscious effort, attempt to achieve. In some
cases this will be easy and in other cases very difficult.


Source code for a procedure can be examined (or re-examined) in search
of ways to regroup sections of code, especially loops, so that execution is
"resident" within the fewest pages for the longest period. To do this it may
require that the programmer identify points in the source code that correspond
to "page breaks" in the target code. If there were a high enough payoff for this
type of activity, compilers might be coded to optionally print page-break markers
tion of the target code in most cases. (I know of no compilers now operating in
Multics that provide this service.) Unfortunately, any gains from such attention
to page-breaks could easily be lost if the segment is bound with others in an
effort to reduce segment faults.


7.4 ASSIGNMENT OF PROCESSOR RESOURCES


The assignment of processors to processes that need them is a central
management problem in any information utility. In Multics, two management
functions are, in fact, to be simultaneously fulfilled in the second-by-second,
minute-by-minute solution to this assignment problem. These are time-sharing
decision-making and multiprogramming decision-making. Both functions quite
naturally influence the kind of system response that can be expected in the
execution of subsystems developed by users. Introductory details, hopefully adequate
for the needs of the subsystem designer, are provided in this section.*


* The principal MSPM references are Sections BJ.5, BJ.6 and BJ.7.


The time-sharing function guarantees each user a chance to gain an equitable
share of processor time. The term "equitable" is explained in the next
subsection. Roughly speaking, sharing is always among those users that have
the same priority. For reasons we shall see shortly, a user's priority needs
may well vary with time. An ideal time-sharing system should therefore anticipate
(or correctly guess) each user's current need for processor time and accord
him a (higher or lower) level of priority that is consistent with this need.


Principally because of core memory limitations, however, a processor
cannot be effectively time-shared with an unlimited number of equal-priority
processes. The multiprogramming function is therefore restricted so that the
processor is assigned in fact to a sufficiently small subset of the "most deserving".
The subset, called the eligibles, is chosen small enough so that the work
that is done for each member is effective work, and is not degraded, for instance,
by thrashing. An implication of the "subset-of-eligibles" idea is that the processor
may occasionally be forced to idle for very brief periods when and if all
eligible members happen into page-wait or other system wait status at one time.
(Occasional idleness is preferable to the alternative of adding another process
to the list being multiprogrammed. The latter approach would greatly increase
the risk of thrashing. If this occurred, the cost of recovery would prove greater
than the small amount of deliberate idle time.)
7.4.1 Time-Sharing Philosophy


To understand the basis for the scheduling algorithm used in Multics, one
begins with a crude model in which every user is an "interactive user"; his
process consists of executing a series of normally short spurts of computation,
e.g., commands. If the execution time for each command were short, invariant,
and known in advance, then a sensible scheduler might allot each user the time
q_k that is needed to execute command k to completion. Users would be queued
on a single list and permitted to execute in FIFO fashion. No timer run-out
interrupts would be required.


A few commands may be fixed in duration, but, for most, their durations
are functions of their arguments, such as the FORTRAN command whose argument
is the program being compiled. Nevertheless, it is useful to pursue this
line of reasoning because it provides us with a useful conceptual basis from
which to understand the more realistic scheduler used by Multics.
A Conceptual Model
Let q be some (albeit fuzzily defined) average duration for commands in the
system. Suppose current experience indicates that q_0 time units is an appropriate
approximation to q. If the scheduler were initially to allot each user an amount of
time q_0, then a sizeable fraction of users would complete their tasks before their
time "ran out". We accept as a premise that the system should, in the default case,
be designed to favor users who execute short commands by according such users a
relatively high priority. Let each command be classified into one of n categories
(numbered 1, 2, ..., n) according to its duration. Commands in category 1 would
correspond to short durations with priority level 1 (highest priority), while lengthiest
commands in category n would be in priority level n (lowest priority).


Our next step is to associate with each priority level a separate queue. If
a user wishes to execute a command that falls into category j, where 1 ≤ j ≤ n, his
process would then be added to the queue numbered j to wait his turn. Following
the management principle that higher-priority jobs should be executed before
lower-priority jobs, we promulgate the following default rule for the conceptual
model scheduler to follow.


No entry on queue numbered j shall be considered until all higher priority
queues are empty. This rule discourages users from executing long-duration
commands while the load is high. There is also presumed to be a mechanism
for overriding this rule, so that under certain circumstances a user can request
an increase in priority level for his command. (He might, for instance, be
requested to attract the attention of the supervisor by pressing a special button
on his console.) This completes our first model.
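
The default rule of the conceptual model amounts to a strict-priority scan over n FIFO queues. A minimal sketch follows; the class and method names are invented for this illustration, and the override mechanism mentioned above is not modeled.

```python
from collections import deque


class ConceptualScheduler:
    """n FIFO queues; queue j is considered only when queues 1..j-1 are empty."""

    def __init__(self, n):
        self.queues = [deque() for _ in range(n)]  # index 0 holds level 1

    def add(self, user, category):
        """Queue a user whose command falls into category 1..n."""
        self.queues[category - 1].append(user)

    def next_user(self):
        """Return the next user to run, or None if every queue is empty."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


sched = ConceptualScheduler(n=3)
sched.add("long_compile", 3)
sched.add("short_a", 1)
sched.add("medium", 2)
sched.add("short_b", 1)
order = [sched.next_user() for _ in range(5)]
print(order)
```

Although the long command arrived first, it runs last; within a level, users are served in order of arrival.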
The Real Model (Multics)


Here we shall realistically presume that a command's duration is in general
not known in advance. However, we shall take the attitude that the unknown
duration must be determined if meaningful scheduling is to be achieved. Some type
of adaptive technique suggests itself. In the Multics scheduler, it is assumed
that every process arriving on the ready list for the first time deserves a position
on a high priority queue on grounds that the command to be executed will be
a short one. Associated with the queue is some fixed time allotment q_0 (say one
second). When a process on this queue is picked to compete (in the
eligible-for-multiprogramming sense), the command may run to completion. If so, the
process will then call block before the allotment q_0 is used up. On the other
hand, if the allotted time is exceeded, execution will be halted by a timer run-out
mechanism. The command is then assumed to belong to the next category.
The containing process can no longer compete on the first queue, so the process
is awarded an additional allotment of time, say 2 x q_0, and placed at the end of
the next lower level priority queue.


Each time the current time allotment is exceeded, execution is stopped so
that the command's category can be "reappraised". That is, the process is given
an additional allotment, say twice the preceding allotment (e.g., 2 x (2 x q_0)), and
placed at the end of the next lower priority level queue. In this fashion we see
how the system learns adaptively about the "true" duration category, ℓ, of a
command's activation. When command execution is completed at some priority
level, ℓ (or more strictly speaking, whenever an interaction is accomplished),
the process is rescheduled, anticipating execution of another command. Specifically,
the process is dissociated from queue ℓ and reassociated with the top
priority level queue with allotment q_0.


A price has been paid to learn a command's "true" duration category when
more than q_0 time units are needed. This price amounts to premature processing
of a variable portion of its total execution. During this time the process is
allowed to compete briefly with more favored commands. Thus, if a command's
execution is in the range q_0 x 2^(ℓ-1) < duration < q_0 x 2^ℓ, then at least half of its
processing (q_0 x 2^(ℓ-1)) will be completed at a perhaps unduly high priority.*
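
The adaptive reappraisal can be traced with a few lines of Python. The sketch assumes the simplest form of the scheme described above: the first allotment is q0, each later allotment is exactly twice the preceding one, and nothing else interrupts the command. The function name and its return convention are ours.

```python
def run_command(duration, q0=1.0):
    """Trace one command through the doubling allotments.

    Returns (final_level, time_used_at_higher_priority_levels), where
    level 1 is the top priority queue with allotment q0.
    """
    level, allotment, used = 1, q0, 0.0
    while duration - used > allotment:
        used += allotment  # timer run-out: the whole allotment was consumed
        level += 1         # move to the next lower priority queue
        allotment *= 2     # with twice the preceding allotment
    return level, used


for duration in (0.5, 5, 7):
    level, early = run_command(duration)
    print(duration, level, early, early / duration)
```

Under these assumptions a command needing 5 units finishes at level 3 after spending 3 units (60 percent of its work) at levels 1 and 2; one needing 7 units also finishes at level 3, with about 43 percent done at the higher levels. The share done early is thus roughly half or more, in line with the estimate given in the text.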
The Multics solution+ to the processor assignment problem is made
* If we assume there are cost vectors C and R such that c_i is the charge to execute for q_0
seconds at priority level i, and r_i is the cost of rescheduling the process from priority level
i to i + 1, then the excess cost, EC, to learn the true category, s, of a command's particular
activation is at most

    EC = \sum_{i=1}^{s-1} \left[ (c_i - c_s) \times 2^{i-1} + r_i \right]

Of course, a user would not necessarily be charged in this manner. EC is a quantity that is
mainly of conceptual value in understanding the nature of the Multics scheduler.


+ Modular design of the central supervisor has made it feasible for the Multics system architects 
to try many variations of the basic traffic control and scheduling algorithm whose concepts have 
been sketched earlier. The details presented here are, to the best of the writer's knowledge, 
consistent with a version of the scheduler used in the Spring of 1969 and are useful for illustrative 
purposes. Details of the actual scheduler may differ from those described here, but in all likeli- 
hood differences from those described here should be of no significance to the subsystem writer. 
System designers wishing to learn about the current algorithm are expected to consult the appro- 
ieee BJ sections of the MSPM. 


conceptually simple when it is discussed with the ready list as the focal point. 
| @ For then, albeit at risk of some oversimplification, one sees that: 
(1) ‘The decision process that determines where new entries will be 
inserted in the ready list and what time allotments to give them 


(scheduling) amounts to fulfilling the time-sharing decision 
function, | | 


(2) The decision process that determines which entries on the ready 
list to run amounts to fulfilling the multiprogramming decision 
function, (Removing an entry means awarding a PEOCeE sor to the 
process associated with that entry.) _ | 


7.4.2.1 Insertions in the ready list (scheduling)


The ready list may be viewed as a set of n queues (each a ready list), one
per priority level. At the time it is created, each user process is assigned a
range of priority levels (ℓ1, ℓ2) with initial execution started at level ℓ1.
The range (ℓ1, ℓ2) offers some clue as to the type of time-sharing service the
process will be given. At any given time a ready process has a current priority
level, ℓ, such that 1 ≤ ℓ1 ≤ ℓ ≤ ℓ2 ≤ n. For interactive and absentee user
processes, the ranges (ℓ1, ℓ2) fall on the scale 1 to n as follows:

     Interactive processes would range from 3 to some value k1, while
     absentee processes would range from some value k2 to n-2, with
     k1 having a lower priority level than k2, as suggested by these
     straddling brackets:

            3   -+
                 |  levels for
                 |  interactive     -+  k2
           k1   -+  processes        |  levels for
                                     |  absentee
                                     |  processes
                                    -+  n-2

Figure 7-6 summarizes these ideas and reminds us that there exist (and
indeed, that there must exist) priority levels both higher than the highest user
priority level and lower than the lowest user priority level. There is room "at
the top" for special system processes* that can, if necessary, immediately
capture a processor. Likewise, there is room "at the bottom" for an "idle"
process that can capture the processor should all ready queues at higher priority
levels ever become empty.


* One such process is the Loader Daemon System Process. This process is awakened by the
Traffic Controller for the purpose of "loading" a process that has been identified as ready and
eligible to run on a processor. Loading consists of placing in core and wiring down any or all
of the starred items of Table 7-2 that are not now in core. To be of any value, the Daemon
should run immediately and briefly upon being awakened. For this reason, it must have its
own vital segments and pages in core at all times. As soon as the loading is completed, the
Daemon calls block and waits for another such wakeup.


Level      Process type           Remarks                                  Eligibility
--------   --------------------   --------------------------------------   ---------------
Level 1    Special system         This level is reserved for certain       Always eligible
           processes              system processes like the Loader
                                  Daemon Process. They always run in
                                  short bursts and it is essential that
                                  they can almost immediately pre-empt
                                  a processor when waked up.

Level 2    Special system         Less critical system processes, such
           processes              as the I/O driver, e.g., for the line
                                  printers, which should pre-empt over
                                  all user processes, and the System
                                  Control Process (commonly referred to
                                  as the "answering service").

Level 3    Interactive users      Top priority level for interactive
           and system             users.
           processes

Level k1   Interactive users,     Lowest priority level for interactive
           system and absentee    users.
           processes

  ...

           Absentee processes     Lowest priority level for absentee
                                  users.

Level n    Idle processes         One idle process is earmarked for
                                  (exclusive use of) each processor in
                                  the system.

Note: System processes in the range (3, k1) compete with user processes.

Figure 7-6 The n priority level queues of the ready list


Priority level 3 is reserved for processes that have an urgent, but very
brief, expected need for a processor. E.g., a process which has initiated a
console read command is willing to give up the processor (go blocked) while
waiting for the console typist to type the next input line, but wants the
opportunity to resume execution, i.e., to respond, just as soon as the input step
has been completed. The console read subroutine is a privileged (ring-0)
procedure which, after initiating I/O activity, calls the block entry of the
Traffic Controller after turning on a so-called "interaction" switch in the
process state segment. Any call to block with this switch ON results in
rescheduling the process for the highest user priority queue (level 3).


Lower priority levels would be reserved for absentee processes. These
are processes which run without console assistance, i.e., non-conversational
with the external environment. Absentee processes are akin to foreground-
initiated background jobs in CTSS*. (The priority range for absentee processes
is expected to be (k2, n-1) where 3 ≤ k2 < k1. This would provide some straddling
of the interactive user's priority range, which is (3, k1). In this way overnight
service on relatively short absentee jobs would in effect be "guaranteed".)

7.4.2.2 Eligibility Management - design details


Eligibility management superimposes a needed control on the number of
processes, call it nep, that are being multiprogrammed. This control prevents
thrashing, since the number nep is, in a sense, an approximation of the maximum
number of processes whose complete working sets can fit in the available
core memory. There should be no substantive alteration to the efficacy of the
general multi-level scheduling algorithm as a consequence of superimposing
eligibility control. This subsection and the next one on pre-emption outline the
design details.


* For more information on absentee processes, see BQ.2 and BQ.3. Actually,
absentee processes are not yet provided for in the current Multics.


The number, nep, which will eventually be a value that can be varied by
the system administrator, is a function of available core memory and the number
of available processors. At the present time, nep is fixed at 2, but may well be
increased when 384K of core is used by the system. (There are also certain
special processes that are allowed to and, indeed, must be eligible at all times.
As mentioned earlier, these include, for instance, the Loader Daemon Process
and the one or more idle processes.)


APT entries for eligible processes can be thought of as linked in a list
such that their positions on this list determine their relative priority to capture
a processor. When a running (and eligible) process enters a wait state for the
occurrence of a system event, the CPU will be given to the ready process on the
eligibles list that has the highest relative (or positional) priority. When the
event waited for occurs, the now running process notifies the waiting process
by marking its APT entry accordingly (changing its state bit to ready). If the
process so notified has higher relative priority, the notifying process (executing
in the supervisor, of course) immediately yields the CPU to the notified process.
(We refer to this behavior as CPU pre-emption.) In this way the system favors
processes that become eligible, one at a time, giving each eligible process, when
it reaches the lead position, the greatest chance to run to completion of its time
allotment.


The number, nep, is treated as a system resource that can be allocated
among the active processes, somewhat as core is allocated. Processes gain
and lose eligibility cyclically. The cycle can be traced as follows:

An ineligible process is made eligible when it appears on the top of the
ready list at the time an eligibility vacancy occurs. A vacancy will occur when
a process loses its eligibility for any one of a number of reasons:

     (a) An eligible process enters the blocked or stopped state,

     (b) A process incurs a timer run-out interrupt and is rescheduled,
         or

     (c) A process' eligibility is pre-empted. (We explain eligibility
         pre-emption, as distinct from CPU pre-emption, in the next
         subsection.)


Since a process specifically retains its eligibility when it enters the wait
state, it is entirely possible for all nep of the eligible processes to be in the
wait state simultaneously. In this event, there being no eligible processes on
the first n-1 ready queues, the processor is given to an idle process.

Each time a process gains or loses eligibility, its APT entry is marked
accordingly. When the process loses its eligibility the Traffic Controller selects
the next candidate to be marked eligible. If that candidate is not loaded, the TC
sends a wakeup to the fast-responding Loader Daemon process to load* that
process and finally re-enters the blocked state.
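The eligibility cycle traced above can be illustrated with a small Python
model. This is a sketch under simplifying assumptions, not the Traffic
Controller's actual data structures: the class and method names are
hypothetical, the n priority queues are collapsed into one ordered ready list,
and loading by the Loader Daemon is ignored.

```python
from collections import deque


class EligibilityControl:
    """Toy model: at most nep processes may be eligible at one time."""

    def __init__(self, nep=2):
        self.nep = nep
        self.ready = deque()   # ineligible ready processes; head = top of ready list
        self.eligible = {}     # name -> "ready" or "waiting", in positional order

    def _fill_vacancies(self):
        # An ineligible process becomes eligible when it is on top of the
        # ready list at the time an eligibility vacancy occurs.
        while len(self.eligible) < self.nep and self.ready:
            self.eligible[self.ready.popleft()] = "ready"

    def schedule(self, name):
        self.ready.append(name)
        self._fill_vacancies()

    def wait(self, name):
        # A process entering the wait state retains its eligibility.
        self.eligible[name] = "waiting"

    def notify(self, name):
        self.eligible[name] = "ready"

    def lose_eligibility(self, name):
        # Blocked, stopped, timer run-out, or eligibility pre-emption.
        del self.eligible[name]
        self._fill_vacancies()

    def pick(self):
        # The CPU goes to the ready eligible process with the highest
        # positional priority; if all are waiting, an idle process runs.
        for name, state in self.eligible.items():
            if state == "ready":
                return name
        return "idle"


ec = EligibilityControl(nep=2)
for name in ("A", "B", "C"):
    ec.schedule(name)
ec.wait("A")
ec.wait("B")
print(ec.pick())           # both eligible processes are waiting
ec.lose_eligibility("A")   # the vacancy is given to C
print(ec.pick())
```

With nep fixed at 2, process C stays ineligible while A and B merely wait, so
the idle process runs; C gets its chance only after A gives up its eligibility.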

7.4.2.3 Pre-emption of Eligibility

The required conditions for an ineligible process B to pre-empt the
eligibility of a running process C are:

     (1) B's priority level exceeds that of C (i.e., B's queue number
         is lower than C's),

     (2) The eligibility of a running process C will not be pre-empted
         unless it has executed at least as much of its present allotment
         as the higher-priority process B "intends" to run when it
         captures the processor. Let r be B's time allotment and s the
         amount already used of C's time allotment. The condition to
         be satisfied is that r ≤ s.

If condition (1) is met, but condition (2) is not, process B must reside on
the ready list until (to the nearest time unit) condition (2) is met. Notice that
eligibility pre-emption is a necessary (though not sufficient) prerequisite for
CPU pre-emption.


For whatever reason a process is pre-empted (its CPU or its eligibility),
that process must be immediately rescheduled. A pre-empted process that has
a priority level k is rescheduled by being placed on top of the queue at level k
with a time allotment equal to whatever time is still unused from its last
scheduling allotment.
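The two pre-emption conditions and the rescheduling rule can be expressed
compactly. The following Python sketch is illustrative only; the function
names, the tuple layout of queue entries, and the use of a dictionary of lists
for the ready queues are assumptions made for the example. As in the text, a
smaller level number means a higher priority.

```python
def can_preempt_eligibility(b_level, b_allotment, c_level, c_used):
    """Conditions (1) and (2) for an ineligible process B to pre-empt the
    eligibility of a running process C."""
    higher_priority = b_level < c_level          # condition (1)
    c_has_run_enough = b_allotment <= c_used     # condition (2): r <= s
    return higher_priority and c_has_run_enough


def reschedule_preempted(queues, name, level, allotment, used):
    """Place a pre-empted process on top of the queue at its own level,
    keeping only the unused part of its last allotment."""
    queues.setdefault(level, []).insert(0, (name, allotment - used))


# B (level 3, allotment 2) wants to pre-empt C (level 4, allotment 8).
queues = {4: [("D", 8)]}
print(can_preempt_eligibility(3, 2, 4, c_used=1))   # C has not yet run long enough
print(can_preempt_eligibility(3, 2, 4, c_used=2))   # now both conditions hold
reschedule_preempted(queues, "C", level=4, allotment=8, used=2)
print(queues[4])                                    # C sits ahead of D
```

Placing the pre-empted process at the top of its queue, rather than at the end,
is what gives it the favored position discussed next.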


The net effects are as one would like them to be, namely that a pre-empted
process is favorably treated, relatively speaking. Thus, suppose B, at priority
level 3, pre-empts C at level 4. C is rescheduled at the top of level 4. If,
shortly afterward, B incurs a timer run-out, it will lose its eligibility and be
rescheduled at the bottom of the queue at level 4. Each time a process loses
eligibility, the Traffic Controller immediately tries to fill the created vacancy
before identifying the next eligible process to be selected from the ready list.
In this instance, the eligibility vacancy will be given to C, and C will be picked
to run next, if no other process has been added to the ready list ahead of C while
B was running. This, we should note, is precisely the behavior pattern we want.
Namely, other things remaining the same, a pre-empted process should be
placed in a favored position to recapture the processor when the pre-empting
process next loses its eligibility. Because its position is favorable, the
pre-empted process is unlikely to lose a significant part of its working set. No
core losses to speak of will be experienced if the user process is pre-empted by
a high priority system process, since these processes do not consume significant
quantities of core.


* Loading, you recall, involves placing and wiring into core the pages listed in
Table 7-2.

7.4.3 Expected System Response

With the benefit of the foregoing discussions on scheduling, eligibility
control, and pre-emption, a user is now ready to anticipate the type of system
response that can be expected for his process. We summarize these ideas here.
During slack load periods, e.g., 2 a.m. Sunday morning, users will be
satisfied with the system response to most commands. The ratio, R, defined
as

$$R = \frac{\text{elapsed time for completion of a command}}{\text{virtual time (actual CPU processing time for the command)}}$$

will approach 1.
During peak load periods, however, users who execute commands of long
duration are likely to observe that execution appears to proceed rapidly at first,
then slower and slower, as suggested by the solid curve in Figure 7-7. Thus,
commands that under light system loads might require three minutes of
processor time may require hours to complete in extreme cases. This is because
the user's process sinks to lower and lower priority levels as virtual execution
proceeds. A point is likely to be reached where the process is rarely or never
picked to execute simply because there are one or more processes that are
always queued at higher priority levels. Only after the peak load subsides (fewer
users logged in) will the long-duration command again have a good chance to be
picked for execution.

Figure 7-7 Variation of R (the ratio elapsed time to virtual
time) as a function of virtual time.

User Recourse to Slow Response

A user that is impatient with this response is encouraged to restructure
his process to run in absentee fashion. Alternatively, he can force one or more
"interactions" each of which will have the effect of rescheduling his process
(back up) to level 3. The simplest way to force an interaction is to press the
"quit" button on the console and then type

     start (carriage return)

after the console responds to the quit signal by typing

     ready.

The effect of hitting the quit button is to cause the user's quit responder to call
the Listener to accept the user's next input. Since a console read-in always sets
the interaction switch in <pds> to ON, there is a consequent rescheduling to
level 3. The effect of using the quit button is suggested by the dashed line in
Figure 7-7. Users will learn that the use of the quit button for purposes of
speeding up execution will prove to be an unpleasant way to use the system.
(It turns out to be not much fun hitting quit and typing "start" repeatedly,
especially if it must be done a large number of times.)


7.5 INTERPROCESS COMMUNICATION 
7.5.1 The Nature of Processes and the Nature of Their Intercommunication


In the overview of this chapter (Section 7.1), we initiated a discussion of
interprocess communication--although without taking a serious look into the
nature of their intercommunication. Here we provide a more thorough discussion
of this topic. We shall also provide for interested readers a description of the
tools available (and how to use them), i.e., for the operation of subsystems that
comprise two or more intercommunicating processes.


We already know from previous discussions that processes may properly
function to achieve a common goal only if they communicate as senders and as
receivers of messages via shared data bases. Hence, an understanding of
communication mechanisms, e.g., information content of messages, "mailboxes,"
message switching and routing techniques, validation and protection of messages,
etc., is likely to be essential for detailed subsystem design.


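[Figure 7-8 is a time line of a sequential process alternating between the
active state and the suspended (dormant) state; three suspension points,
numbered 1, 2, and 3, are marked.]

Figure 7-8 Time-line characteristics of sequential processes
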


A process A that wishes to alert a process B of an event of interest to the
latter must send a wakeup and a message to B. The message must be formatted
in a standard fashion, so that its source (Process A) and its target (Process B)
and the specific receiving point within the target's address space (i.e., a
"channel" or mailbox) may be recognized, validated, and accessed. The
receiving process must in turn exercise known scanning habits which will find,
properly interpret, and dispose of messages that have been received in its
mailboxes. Details are discussed below.

7.5.2 Communication Mechanisms


A process may reach any number of suspension points, as suggested in
Figure 7-8 where, for example, three such points are marked. In a very simple
process, these suspension points, however many in number, may occur at the
same program point, i.e., the process loops through this program point. For
n transits of the loop, there will be n suspensions. In each case, then, the
nature of the event waited for will be the same, and even the sender of the
notifying message that permits resumption of the active state may be the same.
In a more general case, however, we can expect suspension points at different
program points, e.g., possibly occurring in different procedures, and even in
different rings. Suspensions at different program points will, in general, be for
different reasons. (Such program points will hereafter be referred to as wait
points.) This means that the nature of the events waited for at different wait
points will in general be different. Moreover, the corresponding sending
processes are not likely to be the same either. Basic to all of this discussion is
the following: For each distinct wait point of a process it must be assured by
prior arrangement that some process (at least one) will send a wakeup message
when the looked-for event has occurred. Suppose then a process that has several
distinct wait points reaches one of them. The process must now be prepared to
wait, if necessary, for a particular message to arrive (possibly from a
particular sender). Of course, there need not be any waiting at all if the wanted
message has already arrived. However, whether or not waiting is required, either
upon reaching the wait point or after being awakened following a suspension, the
suspended process must be sure that the right message has been received before
continuing with its active efforts.


What is involved in searching for the right message? If one pictures that
the receiving process has a single mailbox for all messages, then determining
if the right message has arrived is simply a matter of scanning the contents of
one mailbox--either by an indexed or by an associative search, depending on the
data structure of the mailbox. But, wait! Won't protection considerations
dictate that there may be at least one mailbox per ring of the process? The answer
is yes, but since the reasons are secondary to our main line of thought here, the
explanation is left to our footnote.*


* A one-mailbox approach would mean that a ring-32 procedure, for instance,
could read mail intended for a ring-1 procedure! Clearly, this is unacceptable.
So we must picture a process arranging for the receipt of mail in different,
ring-related mailboxes. In this way, a ring-1 procedure can scan mail in a ring-1
mailbox (and even in a ring-32 mailbox if desired), but a ring-32 procedure would
be able to scan mailboxes only in rings ≥ 32.


Are there cases where one mailbox per ring would be insufficient? In
principle perhaps the answer is no, provided each message fully identified the
wait point, the sender, and the exact time that the message was sent. In
practice, however, the Multics designers have chosen to implement the system in
such a way that several different mailboxes per ring are available. For example,
a process may contain a programmed wait point that asks to wait for receipt of
a message in any of a given list of designated mailboxes. We shall discuss these
ideas in more detail later. We mention them here only to motivate the notion that
a process may have what amounts to sets of mailboxes (each mailbox possibly
empty), one set per ring.* Clearly, each mailbox must bear a unique designation
within the receiving process so that a sender can transmit his messages to their
proper destination.

7.5.2.1 Messages, Mailboxes (event channels) and Transmission 


The technical name used in Multics for a mailbox is event channel. An
event channel is uniquely designated by a 72-bit identifier.† This name is
generated by the system as a result of executing a user-written subroutine call
for the creation of an event channel.††


Origin of the Message 


By added convention, every message originates as a 72-bit item of arbitrary
content (set by the sender). (However, in the course of transmitting the message,
system routines expand it with self-identifying information.)

A message is sent in the form of a call to the hard-core system routine,
hcs_$wakeup:

     call hcs_$wakeup (receiving_process_id,
                       channel_name,
                       message,
                       code);


* There is one byproduct benefit that comes from the implementation decision to have multiple
mailboxes per ring. Let the distinct wait points in some ring r of a process A be designated
as wp1, wp2, ..., etc. Suppose the wait at each of these points is for a message from a
correspondingly different process, e.g., from processes p1, p2, ..., etc. Prior arrangements
between the process pairs (A, p1), (A, p2), (A, p3), etc., for the sending of messages to A
need not be fully coordinated in the sense that A is not forced to give (or to divulge) to p1,
p2, p3, etc., the very same mailbox name. One can regard this flexibility as an advantage in that
there may be less risk of confusion if separate senders are asked to send messages to
different mailboxes, with each mailbox having a different meaning.

† The substructure of the event channel name includes three items: a ring number, a key
(52 bits), and an ECT address. The key is a unique name representing the wall clock time
at which the event channel was created for this process. The ring number identifies the ring
in which the receiver expects to examine messages placed in this channel. Received messages
are saved (until inspected) in a one-per-ring segment called an ECT (Event Channel Table).
The ECT address is simply the offset within this segment at which the possibly-queued
messages for this channel may be found. A channel is in effect a FIFO list. Details of the ECT
data structures should be of no interest to users. They may be found in BJ.10.02.

†† Details on how to create event channels may be found in BJ.10.01.



Note that although the actual text of a message is small and fixed in size, it is
large enough to be used as a pointer to messages of arbitrary size. We defer
momentarily answering the obvious question, namely, how will the sender know
both the process_id of the receiver and its receiving point (the event channel
name). This matter is taken up in the section entitled Setup for Interprocess
Communication.


The hcs_$wakeup routine makes some simple (routine) checks on the first
two arguments so that if they are obviously erroneous*, due to programmer
error, the caller can be alerted if he chooses to examine the returned error code.
After this partial validation, the Traffic Controller's wakeup entry is called, at
which point steps are taken to forward the message to the intended receiver, as
outlined below.

     1.  If the message indeed has a receiver, then it must be possible
         to match the receiver process id with the id of one of the processes
         that now have entries in the Active Process Table. Failure to
         find such a match means that the message is meaningless. Such
         a case results in an appropriate error code being reflected to
         hcs_$wakeup's caller. (Note that a post-office analogy to this
         case is -- "addressee unknown at this address -- return to
         sender".)

     2a. A message aimed at a bona fide target process will be copied
         into a ring zero system table where it is properly augmented
         with "truthful" information about the sender. The system table
         (central storage) is called the ITT (for Interprocess
         Transmission Table). The receiving process will later fetch the
         message out of this table.

     2b. The last step is to call the Traffic Controller at its entry point
         wakeup to wake up the receiving process.

The above steps are summarized in the Figure 7-9 flow chart. Note that
hcs_$wakeup serves as the user's only interface with the otherwise inaccessible
wakeup entry in the Traffic Controller. Protecting this entry from direct user
calls simplifies the logic of the Traffic Controller which, because it is locked
to all other processes when entered, must be kept as simple (and fast-executing)
as possible.
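The sending path just outlined can be modeled in Python as follows. This is a
sketch only: the class TrafficController, the function hcs_wakeup, the error
code values, and the representation of the APT and ITT as dictionaries are all
hypothetical stand-ins chosen for the example. What the sketch is meant to show
is that the sender's identity is supplied by the system, never by the caller's
message.

```python
from collections import deque
from dataclasses import dataclass

OK, NO_SUCH_PROCESS, BAD_ARGUMENT = 0, 1, 2   # hypothetical error codes


@dataclass
class IttEntry:
    channel_name: int
    message: int
    sender_id: int     # added by the system, not by the sender
    sender_ring: int   # ring from which the call was made


class TrafficController:
    def __init__(self):
        self.apt = {}  # process id -> state ("blocked", "ready", "stopped", ...)
        self.itt = {}  # process id -> queue of IttEntry (one queue per process)

    def add_process(self, pid, state="blocked"):
        self.apt[pid] = state
        self.itt[pid] = deque()

    def wakeup(self, sender_id, sender_ring, receiver_id, channel_name, message):
        # Step 1: the receiver must have an APT entry and not be stopped.
        if self.apt.get(receiver_id) in (None, "stopped"):
            return NO_SUCH_PROCESS
        # Step 2a: copy the message into the ITT with truthful sender data.
        self.itt[receiver_id].append(
            IttEntry(channel_name, message, sender_id, sender_ring))
        # Step 2b: wake up the receiving process.
        if self.apt[receiver_id] == "blocked":
            self.apt[receiver_id] = "ready"
        return OK


def hcs_wakeup(tc, caller_id, caller_ring, receiver_id, channel_name, message):
    """User-callable gate: simple argument checks, then the TC's wakeup entry."""
    if receiver_id == 0 or channel_name == 0:
        return BAD_ARGUMENT
    return tc.wakeup(caller_id, caller_ring, receiver_id, channel_name, message)


tc = TrafficController()
tc.add_process(7)
print(hcs_wakeup(tc, caller_id=3, caller_ring=4,
                 receiver_id=7, channel_name=0o123, message=42))
print(tc.apt[7], tc.itt[7][0])
```

Keeping the argument checks in the outer routine leaves the part that runs
inside the Traffic Controller short, in the spirit of the remark above about
keeping the locked code simple.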


* For example, if the process id or certain of the subfields of the 72-bit channel
name are zero, this is clearly an error.
Receivers must be protected against receipt of false messages, whether
accidental or intentionally sent. Certain information about the sender is
therefore added to the copied message that is placed in the ITT. This information,
which is of critical importance for the protection of the receiver, consists of the
sender's process id and the sender's ring number (i.e., the ring in which this
call was made). The user cannot be trusted to transmit these items accurately.
Figure 7-10 shows the message format as stored in the ITT. The Interprocess
Transmission Table is a wired-down system table in ring-0 that is large enough
to hold messages for all known processes. The table is organized as a set of
message queues, one per process. The head of each queue is pointed to from a
fixed position in the APT entry for the corresponding process, so that when any
process re-enters the running state in the Traffic Controller, as a result of
being awakened, it can quickly determine if there have been any messages
deposited in the ITT on its behalf since the last time it ran.


     Traffic Controller's entry point wakeup
          |
          v
     Does the receiving process id match one of the id's of a
     process in the APT (that is not stopped)?
          No  --> Set error code appropriately
          Yes --> continue below
          |
          v
     Make a copy of the message, augment it with information
     about the sender, and place it in a central store-and-
     forward table (ITT)
          |
          v
     Wake up the receiving process


Figure 7-9 Some details of the Traffic Controller's entry
point wakeup
     Field                    Size
     ---------------------    -------
     event channel name       72 bits
     text of the message      72 bits
     sender's process id      36 bits
     sender's ring            36 bits

     Annotation in the figure: The zero means a process message as
     opposed to a device signal.


Figure 7-10 Message format augmented with information about
the sender as placed in the ITT


Getting the message from the central store-and-forward point to the receiver


So far we have considered mainly the mechanics of sending a message as
far as a central forwarding center. In a postal system analogy such as shown
in Figure 7-11, this is the halfway point, e.g., a regional post office. No ordinary
citizen is able to walk up to this center and ask for his mail. Nor, by analogy,
can the Multics user expect to get his mail by attempting to read messages while
they are still in the ITT. He needs help in moving the messages to data areas that
are ring-accessible for his purpose. While the post office automatically pushes
the mail through to its receiver from the central p.o., without any special
coaxing, the Multics analogy is somewhat different. Here some initiative is always
taken by the receiver to pull the message(s) out of central storage and to place
them into the individual ring-accessible event channels of the process. Recall
that a receiver's process will have a table of one or more event channels (an ECT)
in every ring in which there occurs a distinct wait point in that process.
Figure 7-11 Postal system analogy to Multics interprocess
message transmission


A wait point is always programmed as a call to ipc$block which is the
entry point in the so-called wait coordinator, the heart of the interprocess
communication facility. A user program should call this entry point whenever it
must enter the blocked state while awaiting the receipt of a message.


The form of this call is

     call ipc$block (wait_list_ptr, message_ptr, error_code);

where

     wait_list_ptr    designates a list of one or more event-channel names;

     message_ptr      is a pointer supplied as an input argument that
                      specifies the location where the caller expects to
                      receive a message which he can examine.


When called, ipc$block scans the event channels in the list pointed to by the first
argument. Scanning of the channels is done in the listed order, and if lucky enough
to find a message in one of these channels, ipc$block transfers the first such
message found into the location given by the second argument, and returns to its
caller. The message that is actually transferred consists of the six-word message
whose format was shown in Figure 7-10, augmented by a seventh word consisting of
the wait-list index. Thus, if a message is found in the third of eight channels on a
wait list, the seventh word of the returned message will have the value 3.
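The ordered scan of the wait list can be pictured with the following Python
sketch. The function name scan_wait_list and the representation of event
channels as named FIFO queues are assumptions made for illustration; the real
routine returns a seven-word message rather than a Python tuple.

```python
from collections import deque


def scan_wait_list(channels, wait_list):
    """Scan the listed channels in order and remove the first message found.

    channels  -- dict mapping a channel name to a FIFO queue of messages
    wait_list -- channel names in the order they are to be examined
    Returns (message, wait_list_index) with a 1-based index, or None if
    every listed channel is empty.
    """
    for index, name in enumerate(wait_list, start=1):
        queue = channels.get(name)
        if queue:
            return queue.popleft(), index
    return None


channels = {
    "a": deque(),
    "b": deque(),
    "c": deque(["first", "second"]),
    "d": deque(["other"]),
}
print(scan_wait_list(channels, ["a", "b", "c", "d"]))
```

Because the scan follows the listed order, a caller can express a preference
among its channels simply by the order in which it names them.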


Note that ipc$block has been executing in the ring of its caller. (Ipc$block
has ring brackets which are (1, 63, 63).) Consequently, this procedure does not
have ring access for scanning central storage (i.e., the ITT) which may have
received one (or more) of the desired messages. Therefore, ipc$block is forced to
call a privileged routine in ring-0 (at an entry point hcs_$block). This routine in
effect transfers all valid messages that have accumulated in the ITT event queue
for this process. Each message is placed in the event channel that is designated
in that message. Invalid messages, such as those whose channel names do not
match existing channels in the receiving process' ECT's, are summarily discarded.
If, in the course of making these message transfers, not a single message was
transferred into a ring ≥ the validation ring (i.e., that of ipc$block), then it is
clear the wanted message cannot have yet been received. Hence hcs_$block,
which is fully privileged to do so, calls the corresponding entry in the Traffic
Controller to give away the processor.* If, on the other hand, at least one such
message was moved to a ring that is accessible to ipc$block, then hcs_$block
will return so that the former can again scan its given list of channels in hopes
of finding the wanted message. The chain of calls we have just discussed is
summarized in Figure 7-12.


If we were to follow the return path from the TC backwards toward the
point of call to ipc$block, it becomes easy to see how a fresh message, received
at ipc$wakeup, can be thought of as being forwarded from central storage to the
appropriate event channel of the receiver.


It should be recalled that when an awakened process finally recaptures a
processor, effective execution will resume as a return from the block entry in
the TC to its caller, hcs_$block. The latter then transfers all newly arrived
messages from its ITT event queue into the appropriate event channels. If no
messages were transferred into rings > that of the caller, ipc$block, then
the process cannot have received the message it was waiting for. Hence,
hcs_$block again calls the TC at entry point block to give away the processor.
But if at least one potentially suitable message was transferred from the ITT,
hcs_$block returns to its caller (ipc$block). (This is how the pulling of
messages is done--in the absence of an explicit effort, e.g., a call to ipc$block,
messages for this process can in principle pile up in central storage without ever
being drawn out.) Note that the return to ipc$block is no guarantee of a return
to its caller. If ipc$block finds no message in any of the listed event channels,
it simply calls hcs_$block again.
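The chain of calls just described can be sketched as a small Python model. This is only an illustration under stated assumptions, not Multics code: the class `Process`, the functions `ipc_block` and `hcs_block`, and the callback `give_away_processor` (standing in for the TC's block entry) are hypothetical names, and the model reads the ring test as "the caller's ring or higher", which is an interpretation of the text rather than something it states.

```python
from collections import deque


class Process:
    """Toy model of one process's message plumbing (hypothetical names)."""

    def __init__(self):
        self.itt_queue = deque()  # stands in for this process's ITT event queue
        self.ect = {}             # ring number -> {channel name -> message queue}

    def create_channel(self, ring, name):
        self.ect.setdefault(ring, {})[name] = deque()


def hcs_block(proc, caller_ring, give_away_processor):
    """Ring-0 stand-in: drain the ITT queue into the per-ring event channels."""
    while True:
        useful = False
        while proc.itt_queue:
            ring, name, text = proc.itt_queue.popleft()
            channel = proc.ect.get(ring, {}).get(name)
            if channel is None:
                continue              # no such channel: message is discarded
            channel.append(text)
            if ring >= caller_ring:   # assumption: caller's ring counts as accessible
                useful = True
        if useful:
            return                    # let the caller rescan its wait list
        give_away_processor(proc)     # stands in for the TC's block entry


def ipc_block(proc, caller_ring, wait_list, give_away_processor):
    """Scan the wait list in order; pull messages from the ITT when it is empty."""
    while True:
        for index, name in enumerate(wait_list, start=1):
            channel = proc.ect[caller_ring][name]
            if channel:
                return channel.popleft(), index  # message plus wait-list index
        hcs_block(proc, caller_ring, give_away_processor)
```

In this model a return from `hcs_block` does not guarantee that `ipc_block` returns to its own caller: if the rescan finds nothing on the wait list, `hcs_block` is simply called again, as in the text.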


7.5.2.2 Setup for Interprocess Communication†


Here we shall discuss how a sender learns the identification of a receiver
process and the identification of that receiver's event channel. For convenience
let us adopt the following notation.


* To be absolutely precise about things, there is still a possibility for a last-minute "reprieve":
if at any time up to the very last instant before giving the processor away a wakeup arrives,
control will return to the TC's caller. For more details you could review J. H. Saltzer's Ph.D.
thesis or BJ.3.01 to see the function of the so-called "wakeup waiting" switch.

† For a more basic discussion of this topic, the reader may wish to examine the paper, "The
Multics Interprocess Communication Facility", by M. J. Spier and E. I. Organick, submitted
for presentation at ACM's Second Symposium on Operating Systems Principles, Princeton
University, Princeton, New Jersey, October, 1969.


[Figure 7-12: flow chart. ipc$block (wait list ptr, message ptr, code) executes in
the ring of its caller (1 thru 63) and scans the event channels in the listed order.
If a message is found, it transfers it from the channel to message ptr, sets the
return status in code, and returns; if not, it calls hcs_$block. In ring 0,
hcs_$block transfers event messages from the ITT queue to individual channels in
the per-ring ECT's and calls the TC entry block. The TC's block entry asks whether
there are any events on this process' event queue in the ITT; if not, it gives away
the processor.]

Figure 7-12 The chain of calls: ipc$block -> hcs_$block -> TC block entry.
Illustrating how event messages are pulled out of the ITT queue and distributed
to individual event channels in the user rings. (Readers should note that
Figure 7-15 is a more complete description of ipc$block.)


Let B-to-A setup info be that basic information that is required by a sender
process B so that it can send a message to a receiver process A. This information
consists of A's 36-bit process id and A's 72-bit event channel name.

Let p(A) and p(B) refer to the people responsible for programming A and B,
respectively. (To be sure, they may be the same individual, wearing "two hats".)
Clearly, the system-provided message transmission facility (IPC) cannot be
employed to transmit B-to-A setup info, else why would setup info be needed in the
first place? Note also that p(A) cannot supply p(B) with the B-to-A setup info in
advance. This is because a process id is a clock-dependent unique bit string
generated by the system at process creation time. Furthermore, A's event channel
name, which is also a clock-dependent unique bit string, will not be known until
after A's declaration that creates the event channel has been executed. There
appears to be only one possible plan for passing setup info. The plan is as
follows:


(1) p(A) and p(B) agree in advance on the (unambiguous) name of a
    segment that is to be shared by A and B. Call this segment
    <shared>. Also agreed upon is an offset within <shared>;
    call it [setupBA], which is to be regarded as a 3-word mailbox
    initially set to zero.

(2) After A and B have been created, and after A has declared
    (created) the appropriate event channel, A places in
    shared$setupBA the desired setup info.

(3) B fetches the three words at shared$setupBA, and if non-zero,
    assumes by convention that the required B-to-A setup
    information has been obtained.
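The three-step plan can be sketched in Python as follows. This is a minimal model under stated assumptions, not the system's own mechanism: the class `Mailbox` and the functions `post_setup` and `fetch_setup` are hypothetical names, and an ordinary Python object stands in for the 3-word mailbox at shared$setupBA.

```python
from dataclasses import dataclass


@dataclass
class Mailbox:
    """Stand-in for the 3-word mailbox: 36-bit process id, 72-bit channel name."""
    pid: int = 0       # one word; zero means "not yet filled in"
    chname: int = 0    # two words; zero means "not yet filled in"


def post_setup(mailbox, pid, chname):
    """Step (2): the receiver A deposits its B-to-A setup info."""
    if not (0 < pid < 2**36 and 0 < chname < 2**72):
        raise ValueError("process id or channel name out of range")
    mailbox.chname = chname
    mailbox.pid = pid


def fetch_setup(mailbox):
    """Step (3): the sender B reads the mailbox; None means still all zero."""
    if mailbox.pid == 0 and mailbox.chname == 0:
        return None
    return mailbox.pid, mailbox.chname
```

A sender that finds `None` has simply looked too early; what it does next (retry later, report an error) is left to the convention agreed between p(A) and p(B).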
Note that if A is also to become a sender to B (not just a receiver), then A-to-B
setup (as opposed to B-to-A) is also needed. This info can be sent by a similar
prearrangement, although in fact a form of "bootstrapping" can now be achieved,
if it is desired, to avoid further use of <shared>. That is, the first message B
sends to A can, by further convention, contain the A-to-B setup info.

But how does A know its B-to-A setup info so that it can place it at
shared$setupBA?
How A obtains its own process id

When a process is created, one of the temporary segments that is created
for it and placed in its process directory is called <process_info>. This segment
contains special information about the process that can be read by all procedures.
Among these is the process id, which is stored in process_info$processid.* Any
user procedure can snap a link to and read this word.


How A obtains an event channel name 


A user creates an event channel simply by calling ipc$create_ev_chn. The
first of two return arguments in the call contains (upon return) the 72-bit name
that the IPC has established for this channel. Henceforth, it is the user's
responsibility to keep track of this name.


Summary 


The steps that would be coded by p(A) and p(B) to establish interprocess
communication with B as a sender and A as a receiver can now be summarized.

1.  p(A) codes the following steps in some procedure of A:

    (a) call ipc$create_ev_chn (channelBA, code); this call creates
        an event channel which can hereafter be referred to by the
        name, channelBA, because the value of the first return
        argument is a 72-bit unique id of channelBA.

    (b) assign to the 3-word mailbox at shared$setupBA values of
        process_info$processid and channelBA. Illustrative epl
        coding is provided in the accompanying footnote.†

2.  p(B) can code B to pick up the required setup info at any time and
    use it to send a two-word message to A. Illustrative epl coding

* Some consideration is currently being given to merging <process_info> with another segment
in order to reduce the working set. It is for this reason that this argument was not listed in
Table 7-2. If this change should be implemented, however, the name, process_info, will still
be used in referencing the process id.


† In epl this might be accomplished with coding that relies on a 3-word based structure for
a mailbox:

     dcl 1 mailbox3 based (p),
           2 pid bit (36),              /* a process id */
           2 chname fixed bin (71);     /* a channel name */

Then, the executable code which would follow creation of the desired event channel might look
like:

     p = addr(shared$setupBA);
     p -> mailbox3.pid = process_info$processid;
     p -> mailbox3.chname = channelBA;

is provided in the accompanying footnote.*

3.  p(A) is now able to code appropriate calls to ipc$block at various
    points in A, to wait for messages from B. A call of the form

         call ipc$block (argptr, msgptr, code);

    gets the job done on the assumption that the first two arguments
    are pointers to the base of structures, the first being to a wait
    list of channel names, and the second to an area of sufficient size
    for receipt of the message.


The general structure for the wait list is of the form shown in Figure 7-13.

[Figure 7-13: (a) general form of a wait list: a word holding n, the number of
channel names on this list, followed by the 1st, 2nd, ..., nth channel names;
(b) appearance of the wait list for the example in the text, whose channel name
is the value of channelBA.]

Figure 7-13 Wait lists. General and Specific.


* We shall assume that B also uses a declaration for a 3-word mailbox identical to the one in
the preceding footnote. Then coding in B might appear as:

     p = addr(shared$setupBA);
     receiver_pid = p -> mailbox3.pid;
     channel_name = p -> mailbox3.chname;
     if ^(receiver_pid = 0 & channel_name = 0)
        then call hcs_$wakeup (receiver_pid, channel_name, message, code);
        else call print_error;

Here message is a 72-bit message and print_error might be a routine to print an appropriate
error message before proceeding with whatever steps are then deemed appropriate.
7.5.3 Programming of a Multi-purpose Process

A Multics process is basically sequential in nature by virtue of the fact that
but a single execution point (or point of control) is free to traverse over its
address space at any one time. For this reason, it is natural to think of such a
process as having a single purpose.

If two or more independent computations are to be performed, albeit related
to one another, it is entirely appropriate for the programmer to create a separate
process, one per each defined purpose, and have these processes execute in any
interrelated fashion that seems appropriate. In fact, this approach is recommended
for most initial efforts of this kind.

Subject to processor availability, concurrent computation of the separate
but related processes may occur in some fashion, but it is not predictable, of
course, since the Traffic Controller and its functions are outside the control of
the programmer. In any case, by proper use of IPC, the separate processes
(purposes) may synchronize with one another.

It is worth noting, however, that the establishing and maintaining of separate
address spaces, one per process, incurs an appreciable system overhead. Such
costs are ultimately passed on to the user directly or indirectly. Hence it may
well be worth considering under what circumstances it is feasible to coalesce (and
condense) the address spaces of several processes into a single, now multi-purpose
process having one address space (and one execution point).

Certainly, it is necessary that concurrent pursuit of the separate purposes,
i.e., parallel executing of the separate tasks, be no requirement. (But, then such
a requirement, even without coalescing, cannot be guaranteed in Multics anyway.)
Beyond this, the order in which these tasks may be initiated and executed should
in some sense be of secondary importance and perhaps be independent of the tasks
themselves. This requirement may be satisfied in the case where events external
to the process drive the multi-purpose process. That is, IPC messages
received by the process are the basis for deciding which task to execute next.
Examples of multi-purpose processes are common among system software
processes, e.g., answering services, I/O device managers, automatic recorders,
automatic file dumpers, etc.


A process that must behave in multi-purpose manner can in principle be
coded using flow chart logic described in Figure 7-14. The basic idea suggested
in the figure is to create n event channels, one for each of the n distinct purposes
of the process. After furnishing setup information for use of each of the n
channels to the n senders (not necessarily n different processes), the process calls
ipc$block to await one of the n types of events. Each time an event arrives,
ipc$block returns with a message. The message is examined (in box 5 of the
flow chart) to determine which channel (type of event) has occurred so as to invoke
the associated task. When the task is completed, a call is again issued to
ipc$block.

We have not yet defined what we mean by a task. The simplest idea is to
suppose it corresponds to a call to a procedure that is associated with the
corresponding event channel. We will be interested in understanding what restrictions
are imposed, if any, as to what may go on inside the called procedure. For
instance, are calls to ipc$block to be permitted from within the associated
procedure and/or from any of its dynamic descendants? We shall consider this
possibility in the next subsection. For the moment, however, we shall assume
such repeated calls do not happen.
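The dispatching idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `run_multipurpose` is a hypothetical name, the `block` argument is any callable standing in for ipc$block that returns a message together with its 1-based wait-list index, and the loop is bounded by `max_events` only so that the sketch terminates.

```python
def run_multipurpose(tasks, block, max_events):
    """Dispatch loop for a multi-purpose process (illustrative only).

    tasks      -- list of (channel_name, procedure), one per purpose
    block      -- stand-in for ipc$block: takes the wait list and returns
                  (message, wait_list_index), the index being 1-based
    max_events -- number of events to serve before stopping
    """
    wait_list = [name for name, _ in tasks]
    results = []
    for _ in range(max_events):
        message, index = block(wait_list)    # wait for any of the n events
        _, procedure = tasks[index - 1]      # which channel had the message?
        results.append(procedure(message))   # invoke the associated task
    return results
```

Each task here runs to completion before the next wait, which matches the assumption made above that a task does not itself call ipc$block.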


7.5.3.1 Event call channels

Note that further logical simplification (from the point of view of the user)
arises and a slight increase in efficiency can be gained if the control logic of
boxes 3, 4, 5 and 6 is made part of ipc$block. At the top-most logic level the
process would be characterized simply as the execution of boxes 1 and 2. That
is, initializing of channels, transmitting of setup information, etc., followed by
a single call to ipc$block. There would be no return and of course no repeated
calls on ipc$block. Of course, it would be necessary to furnish IPC with more
information so it can perform its more elaborate job. Basically, this amounts
to telling IPC what are the procedures that should be called (invoked) upon receipt
of respective messages.

The "simplification" we have been discussing is in fact provided for in
Multics by allowing the user to designate event channels of his choice for special
interpretation. Event channels marked in this fashion are referred to as event
call channels, as opposed to the ordinary event wait channels. Messages found
in event call channels are examined and interpreted while the process is executing
inside the IPC. Interpretation amounts to execution of a call to the associated
procedure.
Figure 7-14 A possible structure for a multi-purpose process

7.5.3.2 Concept of the Wait Coordinator

As can now be seen, the code associated with entry point ipc$block is in
fact more sophisticated than a simple scanner for messages received in event
channels, since some action decisions (i.e., interpretation) are in fact delegated
to this procedure. The code is referred to in MSPM documentation as the Wait
Coordinator*, and aptly so.

* The principal MSPM reference is BJ.10.03.

Once an event channel has been created, a programmer is free to declare,
by a call to an appropriate IPC entry point, that said channel is thereafter to be
regarded by the Wait Coordinator as event call type. Subsequently, when the
process is executing inside the Wait Coordinator, a scan of event channels that
turns up a message in an event call channel will trigger a call to the associated
procedure. It should be stressed that return from this call is to a return point
within the Wait Coordinator. The net effect, therefore, is that, in the case of
event call channels, the action of boxes 5 and 6 of the Figure 7-14 flow chart is
accomplished by proxy, i.e., on behalf of the procedure that calls the Wait
Coordinator.

When a Multics user wishes to establish an event channel to be of the call
type, he takes the following action:

(1) Create the event channel by a call to ipc$create_ev_chn. (This
    step sets up the channel, but its default interpretation is of the
    event wait type, i.e., while given this interpretation it may
    only be used as pictured in Figure 7-14.)

(2) Declare said channel to be of the event call type by a call to
    ipc$decl_ev_call_chn. The form of this call is

         call ipc$decl_ev_call_chn (channel_name, associated_procedure_ptr,
                                    data_ptr, priority, code);

    where channel_name is the name obtained in step (1).
    The second and third arguments in this call are saved in the
    event channel table (ECT) for later use by the Wait Coordinator
    so it can construct the desired call to the associated procedure.
    Since a user is free to declare more than one event channel to
    be of the call type, it is necessary to provide the Wait Coordinator
    a scanning order for these channels. The user furnishes an
    integer argument, priority*, to be used by the Wait Coordinator
    as a scanning index. Thus, if channel billy and channel tilly
    are both declared to be event call channels and priorities 2 and
    1 are associated with them, respectively, then if messages have
    been received on both channels, the procedure associated with
    channel tilly will be called before the one associated with
    channel billy.


The next three subsections (7.5.3.3, 7.5.3.4, and 7.5.3.5) round out the
design details of the Wait Coordinator that may be of interest to some readers.
They can easily be skipped on a first reading.

7.5.3.3 Call-Wait Polling Order


Although we have just suggested the rule for scanning event call channels,
we have yet to explain the dependency relationship that exists between rules for
scanning event call channels and those for scanning event wait channels. Each
call to the Wait Coordinator (in reality a call to ipc$block) is in fact a request to
scan, not one, but two lists of channels, the wait list and the call list. The
wait list is the list of event wait channels that is pointed to (first argument) in
the call to the Wait Coordinator. The call list is the list of event call
channels that are currently kept in the ECT for the ring of the Wait Coordinator's
caller.

We shall say that a W-C polling order is one in which the wait list is scanned
first and then the call list, while a C-W polling order is the reverse (i.e., call list
before wait list). The system's default polling order is W-C, but a user is given
the opportunity, by calling a special entry point in the IPC, to reverse the current
polling order.

It should be remembered that whenever a message is found in a channel of
the wait list, channel scanning ends immediately. The discovered message
(augmented by the wait list index) is copied into the caller's message area, and the
Wait Coordinator returns to its caller. This means that when functioning in the
default (W-C) polling order, the event call list is scanned only if no event wait
message is found.

* Strictly speaking, this argument is a priority level: the lower the integer (level),
the higher the priority.


Whenever a call list is scanned, it is scanned in its entirety. Each event 
call channel is inspected for a message. If one is found, the associated procedure 
is called and, following a return from this call, if any, the next event call channel 
in the list is inspected, etc. The above scanning logic is summarized in the 
Figure 7-15 flow chart. 
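The scanning rules of this subsection can be sketched as a single pass of a toy Wait Coordinator in Python. This is an illustration under stated assumptions only: `wait_coordinator_pass` is a hypothetical name, channels are modeled as Python deques, and what the real Wait Coordinator does after a pass that finds no wait message (Figure 7-15) is not modeled here.

```python
from collections import deque


def wait_coordinator_pass(wait_channels, call_channels, order="W-C"):
    """One scan of the wait list and the call list (illustrative only).

    wait_channels -- list of deques, in wait-list order
    call_channels -- list of (priority_level, deque, procedure) entries
    order         -- "W-C" (the default in the text) or "C-W"
    Returns (message, wait_list_index) or None if no wait message was found.
    """

    def scan_wait():
        for index, channel in enumerate(wait_channels, start=1):
            if channel:
                return channel.popleft(), index  # scanning ends at first hit
        return None

    def scan_call():
        # Lower priority level means higher priority, so scan in ascending order.
        for _, channel, procedure in sorted(call_channels, key=lambda c: c[0]):
            if channel:
                procedure(channel.popleft())     # one message per channel per scan

    if order == "W-C":
        found = scan_wait()
        if found is None:
            scan_call()      # call list is scanned only if no wait message
        return found
    scan_call()              # C-W: the whole call list first
    return scan_wait()
```

With the billy and tilly channels of the earlier example (priority levels 2 and 1), a pass that reaches the call list invokes tilly's procedure before billy's.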


7.5.3.4 Invoking an associated procedure and controlling its repeated use

A designer of a subsystem that uses an event call channel will, of course,
be required to designate the name (or pointer) of the associated procedure at the
time he declares the channel to be of event call type. The same designer may
also be required to code this procedure. (We will call it AP, for convenience.)
The Wait Coordinator, when it issues the call to AP, will always use a standard
form for this call. AP's author must therefore code it so that it is compatible
with this standard call. The rules are explained in the accompanying footnote.*


Several messages can be queued up over the same event call channel. But
the Wait Coordinator must see to it that it treats only one message at a time (the
topmost). It should not recognize the next message in the queue until processing
of the topmost is completed. This means that an associated procedure must
return control before the Wait Coordinator can permit itself to again inspect the
same event call channel. The following paragraphs show why the controls are
needed and how they are achieved.


It is easy to see how the Wait Coordinator could get into this situation.
Suppose AP1 is called for the first message of a call channel "1" and suppose
that during its execution AP1 must call ipc$block to await a message on some
event wait channel. Further, suppose this message has not arrived at the time
of this fresh call to ipc$block. The Wait Coordinator might then find itself
scanning channel 1 once again, and if it finds another message (assuming no controls
were set to prevent it), would call AP1 once again. We would then have a


* Conventions used in calling an associated procedure (AP) are:

1. The AP has one argument, message_ptr, which is a pointer to an eight-word based
   structure, the first six words of which are as given in Figure 7-10.

2. Before issuing the call, the Wait Coordinator appends to these six words a word pair
   whose value is that of data_ptr. This item, you recall, is the third argument
   furnished in the declaration of the event call channel. The data_ptr can, therefore, be
   thought of as an ordinary arglist ptr of a procedure, except it is available only
   indirectly in the message argument. In addition, of course, AP has access to the
   first six words of the message area, which is also useful information.
     b. the last thing it does, prior to returning, is to execute an
        ipc call that unmasks all event call channels, i.e.,

             call ipc$unmask_ev_calls (code);

     This approach has the merit that any task that is invoked is
     treated as having absolute top priority. In a sense, it can
     be regarded as an extreme approach to solving the problem.
     Figure 7-18 illustrates what we mean. Here we show the
     effects of masking and unmasking event call channels using
     the same event timing sequence as in Figure 7-17. Note
     that task EC4, because of its low priority, is not even begun
     during the time span being considered.

(3)  This approach will no doubt prove to be the most attractive: Give
     up trying the multi-purpose approach in the first place! Go back
     to the principal design approach of Multics and let each task that
     is now programmed as an event call task and in need of better
     response from its Wait Coordinator be made into an independent
     process to accomplish the same objective. Each such created
     process would have a single event call channel over which it can
     be signalled. Hence, competition for good response by its Wait
     Coordinator will now be eliminated.

     Bear in mind, however, that after establishing separate processes
     for the several tasks, these can now, in principle anyway, be
     executed in parallel, whenever two or more processors can be
     awarded to these tasks during a single period of time. The mere
     fact that execution can proceed in parallel as a result of following
     this approach carries with it the need for greater care in the
     handling of shared data segments.
a message in any of a given list of designated mailboxes. We shall discuss these
ideas in more detail later. We mention them here only to motivate the notion that
a process may have what amounts to sets of mailboxes (each mailbox possibly
empty), one set per ring.* Clearly, each mailbox must bear a unique designation
within the receiving process so that a sender can transmit his messages to their
proper destination.

7.5.2.1 Messages, Mailboxes (event channels) and Transmission

The technical name used in Multics for a mailbox is event channel. An
event channel is uniquely designated by a 72-bit identifier.† This name is
generated by the system as a result of executing a user-written subroutine call for the
creation of an event channel.††


Origin of the Message 


By added convention, every message originates as a 72-bit item of arbitrary
content (set by the sender). (However, in the course of transmitting the message,
system routines expand it with self-identifying information.)

A message is sent in the form of a call to the hard-core system routine,
hcs_$wakeup:

     call hcs_$wakeup (receiving_process_id,
                       channel_name,
                       message,
                       code);

* There is one byproduct benefit that comes from the implementation decision to have multiple
mailboxes per ring. Let the distinct wait points in some ring-r of a process A be designated
as wp1, wp2, ..., etc. Suppose the wait at each of these points is for a message from a
correspondingly different process, e.g., from processes p1, p2, ..., etc. Prior arrangements
between the process pairs (A, p1), (A, p2), (A, p3), etc., for the sending of messages to A
need not be fully coordinated in the sense A is not forced to give (or to divulge) to p1, p2, p3,
etc., the very same mailbox name. One can regard this flexibility as an advantage in that
there may be less risk of confusion if separate senders are asked to send messages to
different mailboxes, with each mailbox having a different meaning.

† The substructure of the event channel name includes three items: a ring number, a key
(52 bits), and an ECT address. The key is a unique name representing the wall clock time
at which the event channel was created for this process. The ring number identifies the ring
in which the receiver expects to examine messages placed in this channel. Received messages
are saved (until inspected) in a one-per-ring segment called an ECT (Event Channel Table).
The ECT address is simply the offset within this segment at which the possibly-queued
messages for this channel may be found. A channel is in effect a FIFO list. Details of the ECT
data structures should be of no interest to users. They may be found in BJ.10.02.

†† Details on how to create event channels may be found in BJ.10.01.
Note that although the actual text of a message is small and fixed in size, it is
large enough to be used as a pointer to messages of arbitrary size. We defer
momentarily answering the obvious question, namely, how will the sender know
both the process id of the receiver and its receiving point (the event channel
name). This matter is taken up in the section entitled Setup for Interprocess
Communication.

The hcs_$wakeup routine makes some simple (routine) checks on the first
two arguments so that if they are obviously erroneous*, due to programmer
error, the caller can be alerted if he chooses to examine the returned error code.
After this partial validation, the Traffic Controller's wakeup entry is called, at
which point steps are taken to forward the message to the intended receiver, as
outlined below.


1.  If the message indeed has a receiver, then it must be possible
    to match the receiver process id with the id of one of the processes
    that now have entries in the Active Process Table. Failure to
    find such a match means that the message is meaningless. Such
    a case results in an appropriate error code being reflected to
    hcs_$wakeup's caller. (Note that a post-office analogy to this
    case is -- "addressee unknown at this address -- return to
    sender".)

2a. A message aimed at a bona fide target process will be copied
    into a ring zero system table where it is properly augmented
    with "truthful" information about the sender. The system table
    (central storage) is called the ITT (for Interprocess Transmission
    Table). The receiving process will later fetch the message
    out of this table.

2b. The last step is to call the Traffic Controller at its entry point
    wakeup to wake up the receiving process.
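The sending path can be sketched in Python as follows. This is a toy model under stated assumptions, not the system's code: `TrafficControllerModel`, `ITTMessage`, and `hcs_wakeup` are hypothetical names, the error codes are placeholder strings, and the sender's process id and ring are passed in explicitly where the real system would supply them itself.

```python
from collections import deque, namedtuple

# One ITT entry: the caller's two items plus "truthful" sender information.
ITTMessage = namedtuple("ITTMessage", "channel_name text sender_pid sender_ring")


class TrafficControllerModel:
    """Stand-in for the APT, the ITT queues, and the wakeup entry."""

    def __init__(self, active_pids):
        self.itt = {pid: deque() for pid in active_pids}  # one queue per process
        self.awakened = []                                 # processes woken, in order

    def wakeup(self, sender_pid, sender_ring, receiver_pid, channel_name, text):
        if receiver_pid not in self.itt:
            return "no_such_process"       # step 1: addressee unknown
        self.itt[receiver_pid].append(     # step 2a: copy and augment
            ITTMessage(channel_name, text, sender_pid, sender_ring))
        self.awakened.append(receiver_pid)  # step 2b: wake the receiver
        return 0


def hcs_wakeup(tc, sender_pid, sender_ring, receiver_pid, channel_name, text):
    """Partial validation of the first two arguments, then the TC's wakeup."""
    if receiver_pid == 0 or channel_name == 0:
        return "bad_argument"               # obviously erroneous arguments
    return tc.wakeup(sender_pid, sender_ring, receiver_pid, channel_name, text)
```

The point of the split is the one made in the text: the user-callable routine does the cheap checks, so the Traffic Controller entry itself can stay short.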
The above steps are summarized in the Figure 7-9 flow chart. Note that
hcs_$wakeup serves as the user's only interface with the otherwise inaccessible
wakeup entry in the Traffic Controller. Protecting this entry from direct user
calls simplifies the logic of the Traffic Controller which, because it is locked
to all other processes when entered, must be kept as simple (and fast-executing)
as possible.

* For example, if the process id or certain of the subfields of the 72-bit channel
name are zero, this is clearly an error.


Receivers must be protected against receipt of false messages, whether
accidentally or intentionally sent. Certain information about the sender is
therefore added to the copied message that is placed in the ITT. This information,
which is of critical importance for the protection of the receiver, consists of the
sender's process id and the sender's ring (i.e., the ring in which this call was
made). The user cannot be trusted to transmit these items accurately. Figure 7-10
shows the message format as stored in the ITT. The Interprocess Transmission Table
is a wired-down system table in ring-0 that is large enough to hold messages for
all known processes. The table is organized as a set of message queues, one per
process. The head of each queue is pointed to from a fixed position in the APT
entry for the corresponding process, so that when any process re-enters the running
state in the Traffic Controller, as a result of being awakened, it can quickly
determine if there have been any messages deposited in the ITT on its behalf since
the last time it ran.


[Figure 7-9: flow chart. At the Traffic Controller's entry point wakeup: does the
receiving process id match one of the id's of a process in the APT (that is not
stopped)? If no, set the error code appropriately. If yes, make a copy of the
message, augment it with information about the sender, and place it in a central
store-and-forward table (ITT); then wake up the receiving process.]

Figure 7-9 Some details of the Traffic Controller's entry
point wakeup

     event channel name        72 bits
     text of the message       72 bits
     sender's process id       36 bits
     sender's ring             36 bits

     (The zero means a process message as opposed to a device signal.)

Figure 7-10 Message format augmented with information about
the sender as placed in the ITT


Getting the message from the central store-and-forward point to the receiver 


So far we have considered mainly the mechanics of sending a message as
far as a central forwarding center. In a postal system analogy such as shown
in Figure 7-11, this is the halfway point, e.g., a regional post office. No ordinary
citizen is able to walk up to this center and ask for his mail. Nor, by analogy,
can the Multics user expect to get his mail by attempting to read messages while
they are still in the ITT. He needs help in moving the messages to data areas that
are ring-accessible for his purpose. While the post office automatically pushes
the mail through to its receiver from the central p.o., without any special
coaxing, the Multics analogy is somewhat different. Here some initiative is always
taken by the receiver to pull the message(s) out of central storage and to place
them into the individual ring-accessible event channels of the process. Recall
that a receiver's process will have a table of one or more event channels (an ECT)
in every ring in which there occurs a distinct wait point in that process.


[Figure 7-11: two-part diagram. U.S. Postal System: mail is posted by sender A at
a local (Cambridge) post office, passes through the central Boston store-and-forward
office, and goes on to other local post offices (Wellesley, Allston). MULTICS: among
processes A through F, user A sends a message to B via a call to ipc$wakeup; the
message is held in the ITT (store and forward); the message is received by B at B's
initiative, i.e., if B is waiting for it via a call to ipc$block.]

Figure 7-11 Postal system analogy to Multics interprocess
message transmission
A wait point is always programmed as a call to ipc$block, which is the entry point in the so-called wait coordinator, the heart of the interprocess communication facility. A user program should call this entry point whenever it must enter the blocked state while awaiting the receipt of a message.


The form of this call is 


        call ipc$block (wait_list_ptr, message_ptr, error_code);

    wait_list_ptr   points to a list of one or more event-channel names.

    message_ptr     a pointer supplied as an input argument that specifies the
                    location where the caller expects to receive a message
                    which he can examine.


When called, ipc$block scans the event channels in the list pointed to by the first argument. Scanning of the channels is done in the listed order, and if lucky enough to find a message in one of these channels, ipc$block transfers the first such message found into the location given by the second argument, and returns to its caller. The message that is actually transferred consists of the six-word message whose format was shown in Figure 7-10, augmented by a seventh word consisting of the wait-list index. Thus, if a message is found in the third of eight channels on a wait list, the seventh word of the returned message will have the value 3.
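The scanning rule just described can be illustrated with a small simulation. This is only a sketch in Python, not Multics code: the function name scan_wait_list, the dictionary of channels, and the placeholder word values are all assumptions made for the illustration.

```python
from collections import deque

def scan_wait_list(wait_list, channels):
    """Return the first queued message found, scanning in listed order.

    wait_list: channel names, in the order they are to be scanned.
    channels:  dict mapping a channel name to a deque of six-word messages.
    The result is the six words plus a seventh word, the 1-based
    wait-list index of the channel in which the message was found.
    """
    for index, name in enumerate(wait_list, start=1):
        queue = channels.get(name)
        if queue:                       # channel exists and holds a message
            six_words = queue.popleft()
            return six_words + [index]
    return None                         # nothing found; the caller must wait

# Eight channels on the wait list; only the third and fifth hold messages.
names = ["ch%d" % i for i in range(1, 9)]
channels = {name: deque() for name in names}
channels["ch3"].append(["n0", "n1", "t0", "t1", "pid", "ring"])
channels["ch5"].append(["m0", "m1", "u0", "u1", "pid", "ring"])

first = scan_wait_list(names, channels)
second = scan_wait_list(names, channels)
third = scan_wait_list(names, channels)
```

Here the first call hands back the message from the third channel with 3 as its seventh word, and only a later call reaches the fifth channel.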


Note that ipc$block has been executing in the ring of its caller. (Ipc$block has ring brackets which are (1, 63, 63).) Consequently, this procedure does not have ring access for scanning central storage (i.e., the ITT), which may have received one (or more) of the desired messages. Therefore, ipc$block is forced to call a privileged routine in ring 0 (at an entry point hcs$block). This routine in effect transfers all valid messages that have accumulated in the ITT event queue for this process. Each message is placed in the event channel that is designated in that message. Invalid messages, such as those whose channel names do not match existing channels in the receiving process' ECT's, are summarily discarded. If, in the course of making these message transfers, not a single message was transferred into a ring ≥ the validation ring (i.e., that of ipc$block), then it is clear the wanted message cannot yet have been received. Hence hcs$block, which is fully privileged to do so, calls the corresponding entry in the Traffic Controller to give away the processor.* If, on the other hand, at least one such message was moved to a ring that is accessible to ipc$block, then hcs$block will return so that the former can again scan its given list of channels in hopes of finding the wanted message. The chain of calls we have just discussed is summarized in Figure 7-12.

If we were to follow the return path from the TC backwards toward the 


point of call to ipc$block, it becomes easy to see how a fresh message, received 


at ipc$wakeup can be thought of as being forwarded from central storage to the 


appropriate event channel of the receiver. 


It should be recalled that when an awakened process finally recaptures a processor, effective execution will resume as a return from the block entry in the TC to its caller, hcs$block. The latter then "transfers" all newly arrived messages from its ITT event queue into the appropriate event channels. If no messages were transferred into rings ≥ that of the caller, ipc$block, then the process cannot have received the message it was waiting for. Hence, hcs$block again calls the TC at entry point block to give away the processor. But if at least one potentially suitable message was transferred from the ITT, hcs$block returns to its caller (ipc$block). (This is how the pulling of messages is done: in the absence of an explicit effort, e.g., a call to ipc$block, messages for this process can in principle pile up in central storage without ever being drawn out.) Note that the return to ipc$block is no guarantee of a return to its caller. If ipc$block finds no message in any of the listed event channels, it simply calls hcs$block again.
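The pull-style chain of calls can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the actual implementation: the Process class, the (ring, channel, text) message triples, and the tc_block stand-in (which simply delivers the next pre-arranged arrival) are all invented for the illustration, and a ring number at or above the caller's is taken to be accessible to the caller.

```python
from collections import deque

class Process:
    def __init__(self, ects, arrivals):
        self.ects = ects                  # ring -> {channel name -> deque}
        self.itt_queue = deque()          # this process' event queue in the ITT
        self.arrivals = deque(arrivals)   # messages that later wakeups will bring
        self.times_blocked = 0

def tc_block(proc):
    """Stand-in for the TC block entry: give away the processor until a
    wakeup places the next message in the ITT queue."""
    proc.times_blocked += 1
    proc.itt_queue.append(proc.arrivals.popleft())

def hcs_block(proc, validation_ring):
    """Ring-0 stand-in: drain the ITT queue into the per-ring ECT's."""
    while True:
        useful = 0
        while proc.itt_queue:
            ring, name, text = proc.itt_queue.popleft()
            channel = proc.ects.get(ring, {}).get(name)
            if channel is None:
                continue                  # invalid message: discarded
            channel.append(text)
            if ring >= validation_ring:
                useful += 1
        if useful:
            return                        # let ipc_block rescan its wait list
        tc_block(proc)                    # nothing usable yet: block

def ipc_block(proc, ring, wait_list):
    """Scan the wait list; pull from the ITT whenever the scan comes up empty."""
    while True:
        for index, name in enumerate(wait_list, start=1):
            channel = proc.ects[ring][name]
            if channel:
                return channel.popleft(), index
        hcs_block(proc, ring)

proc = Process({4: {"a": deque(), "b": deque()}},
               [(4, "zzz", "junk"), (4, "b", "hello")])
text, index = ipc_block(proc, 4, ["a", "b"])
```

In this run the process blocks twice: the first wakeup brings a message for a nonexistent channel, which is discarded, and only the second brings a message that ipc_block can return.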


7.5.2 Setup for Interprocess Communication†


Here we shall discuss how a sender learns the identification of a receiver 
process and the identification of that receiver's event channel. For convenience 


let us adopt the following notation. 


* To be absolutely precise about things, there is still a possibility for a last-minute "reprieve": if at any time up to the very last instant before giving the processor away a wakeup arrives, control will return to the TC's caller. For more details you could review J. H. Saltzer's Ph.D. thesis or BJ. 3.01 to see the function of the so-called "wakeup waiting" switch.

† For a more basic discussion of this topic, the reader may wish to examine the paper, "The Multics Interprocess Communication Facility", by M. J. Spier and E. I. Organick, submitted for presentation at ACM's Second Symposium on Operating Systems Principles, Princeton University, Princeton, New Jersey, October, 1969.


[Figure 7-12: flow chart. ipc$block (wait_list_ptr, message_ptr, code), executing in the ring of its caller (1 thru 63), scans the event channels in the listed order. If a message is found, it is transferred from the channel to the caller's message area, code is set, and ipc$block returns. Otherwise ipc$block calls hcs$block in ring 0, which transfers event messages from the ITT queue to individual channels in the per-ring ECT's. A decision box asks whether there are any events on this process' event queue in the ITT; when the processor must be given away, hcs$block calls the TC entry block. The chart illustrates how event messages are pulled out of the ITT queue and distributed to individual event channels in the user rings. (Readers should note that Figure 7-15 is a more complete description of ipc$block.)]

Figure 7-12 The chain of calls: ipc$block → hcs$block → TC block entry.
Let B-to-A setup info be that basic information that is required by a sender process B so that it can send a message to a receiver process A. This information consists of A's 36-bit process id and A's 72-bit event channel name.


Let p(A) and p(B) refer to the people responsible for programming A and B, respectively. (To be sure, they may be the same individual, wearing "two hats".) Clearly, the system-provided message transmission facility (IPC) cannot be employed to transmit B-to-A setup info, else why would setup info be needed in the first place. Note also that p(A) cannot supply p(B) with the B-to-A setup info by telephone or by other direct personal communication until after A has been created. This is because a process id is a clock-dependent unique bit string that is generated by the system at process creation time. Furthermore, A's event channel name, which is also a clock-dependent unique bit string, will not be known until after A's declaration that creates the event channel has been executed. There appears to be but one sensible plan for passing setup info. The plan is as follows:

(1) p(A) and p(B) agree in advance on the (unambiguous) name of a segment that is to be shared by A and B. Call this segment <shared>. Also agreed upon is an offset within <shared>; call it [setupBA], which is to be regarded as a 3-word mailbox initially set to zero.

(2) After A and B have been created, and after A has declared (created) the appropriate event channel, A places in shared$setupBA the desired setup info.

(3) B fetches the three words at shared$setupBA, and if non-zero, assumes by convention that the required B-to-A setup information has been obtained.
Note that if A is also to become a sender to B (not just a receiver), then A-to-B setup (as opposed to B-to-A) is also needed. This info can be sent by a similar prearrangement, although in fact a form of "boot strapping" can now be achieved, if it is desired, to avoid further use of <shared>. That is, the first message B sends to A can, by further convention, contain the A-to-B setup information.
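The mailbox convention of the three-step plan can be mimicked as follows. This is a Python sketch rather than Multics code: the dictionary standing in for <shared>, the counter standing in for the clock-dependent unique ids, and the names a_publish and b_fetch are assumptions made only for the illustration.

```python
import itertools

_clock = itertools.count(1)            # stand-in for the system clock

def unique_bits():
    """Hypothetical stand-in for a clock-dependent unique, nonzero id."""
    return next(_clock)

shared = {"setupBA": [0, 0, 0]}        # 3-word mailbox, initially zero

def a_publish(shared, pid, channel_name):
    """Step (2): A stores its process id (1 word) and channel name (2 words)."""
    high, low = divmod(channel_name, 1 << 36)
    shared["setupBA"][:] = [pid, high, low]

def b_fetch(shared):
    """Step (3): B reads the mailbox; all zeros means 'not yet available'."""
    pid, high, low = shared["setupBA"]
    if pid == 0 and high == 0 and low == 0:
        return None
    return pid, (high << 36) | low

before = b_fetch(shared)               # B looks too early
pid_a = unique_bits()                  # known only after A is created
channel_ba = (unique_bits() << 36) | unique_bits()
a_publish(shared, pid_a, channel_ba)
after = b_fetch(shared)                # B now has the B-to-A setup info
```

The zero test works only because both ids are guaranteed to be nonzero, which is the reason the plan requires the mailbox to be initially set to zero.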


But how does A know its B-to-A setup info so that it can place it at shared$setupBA?
How A obtains its own process id 


When a process is created, one of the temporary segments that is created for it and placed in its process directory is called <process_info>. This segment contains special information about the process that can be read by all procedures. Among these is the process id, which is stored in process_info$processid.* Any user procedure can snap a link to and read this word.
How A obtains an event channel name 


A user creates an event channel simply by calling ipc$create_ev_chn. The first of two return arguments in the call contains (upon return) the 72-bit name that the IPC has established for this channel. Henceforth, it is the user's responsibility to keep track of this name.
Summary

The steps that would be coded by p(A) and p(B) to establish interprocess communication with B as a sender and A a receiver can now be summarized.

1. p(A) codes the following steps in some procedure of A:

   (a) call ipc$create_ev_chn (channelBA, code); this call creates an event channel which can hereafter be referred to by the name, channelBA, because the value of the first return argument is a 72-bit unique id of channelBA.

   (b) assign to the 3-word mailbox at shared$setupBA the values of process_info$processid and channelBA. Illustrative epl coding is provided in the accompanying footnote.†

2. p(B) can code B to pick up the required setup info at any time and use it to send a two-word message to A. Illustrative epl coding


* Some consideration is currently being given to merging <process_info> with another segment in order to reduce the working set. It is for this reason that this argument was not listed in Table 7-2. If this change should be implemented, however, the name, process_info, will still be used in referencing the process id.

† In epl, this might be accomplished with coding that relies on a 3-word based structure for a mailbox:

        dcl 1 mailbox3 based (p),
              2 pid bit (36),             /* a process id */
              2 chname fixed bin (71);    /* a channel name */

  Then, the executable code which would follow creation of the desired event channel might look like:

        p = addr(shared$setupBA);
        p -> mailbox3.pid = process_info$processid;
        p -> mailbox3.chname = channelBA;
is provided in the accompanying footnote.*

3. p(A) is now able to code appropriate calls to ipc$block at various points in A, to wait for messages from B. A call of the form

        call ipc$block (argptr, msgptr, code);

   gets the job done on the assumption that the first two arguments are pointers to the base of structures, the first being to a wait list of channel names, and the second to an area of sufficient size for receipt of the message.


The general structure for the wait list is of the form shown in Figure 7-13. 


    (a) general form of a wait list:

            number of channel names on this list
            1st channel name
            2nd channel name
            ...
            nth channel name

    (b) appearance of wait list for example in the text:

            a single channel name, the value of channelBA

Figure 7-13 Wait lists. General and Specific.


* We shall assume that B also uses a declaration for a 3-word mailbox identical to the one in the preceding footnote. Then coding in B might appear as:

        p = addr(shared$setupBA);
        receiver_pid = p -> mailbox3.pid;
        channel_name = p -> mailbox3.chname;
        if ¬(receiver_pid = 0 & channel_name = 0)
           then call hcs$wakeup (receiver_pid, channel_name, message, code);
           else call print_error;

  Here message is a 72-bit message and print_error might be a routine to print an appropriate error message before proceeding with whatever steps are then deemed appropriate.
7.5.3 Programming of a Multi-purpose Process


A Multics process is basically sequential in nature by virtue of the fact that 


but a single execution point (or point of control) is free to traverse over its ad- 
dress space at any one time. For this reason, it is natural to think of such a 


process as having a single purpose, 


If two or more independent computations are to be performed, albeit related to one another, it is entirely appropriate for the programmer to create a separate process, one per each defined purpose, and have these processes execute in any interrelated fashion that seems appropriate. In fact, this approach is recommended for most initial efforts of this kind.


Subject to processor availability, concurrent computation of the separate but related processes may occur in some fashion, but it is not predictable, of course, since the Traffic Controller and its functions are outside the control of the programmer. In any case, by proper use of IPC, the separate processes (purposes) may synchronize with one another.


It is worth noting, however, that the establishing and maintaining of separate address spaces, one per process, incurs an appreciable system overhead. Such costs are ultimately passed on to the user directly or indirectly. Hence it may well be worth considering under what circumstances it is feasible to coalesce (and condense) the address spaces of several processes into a single, now multi-purpose process having one address space (and one execution point).


Certainly, it is necessary that concurrent pursuit of the separate purposes, i.e., parallel execution of the separate tasks, be no requirement. (But then, such a requirement, even without coalescing, cannot be guaranteed in Multics anyway.) Beyond this, the order in which these tasks may be initiated and executed should in some sense be of secondary importance and perhaps be independent of the tasks themselves. This requirement may be satisfied in the case where events external to the process drive the multi-purpose process. That is, IPC messages received by the process are the basis for deciding which task to execute next. Examples of multi-purpose processes are common among system software processes, e.g., answering services, I/O device managers, automatic recorders, automatic file dumpers, etc.


A process that must behave in a multi-purpose manner can in principle be coded using the flow chart logic described in Figure 7-14. The basic idea suggested in the figure is to create n event channels, one for each of the n distinct purposes of the process. After furnishing setup information for use of each of the n channels to the n senders (not necessarily n different processes), the process calls ipc$block to await one of the n types of events. Each time an event arrives, ipc$block returns with a message. The message is examined (in box 5 of the flow chart) to determine which channel (type of event) has occurred so as to invoke the associated task. When the task is completed, a call is again issued to ipc$block.


We have not yet defined what we mean by a task. The simplest idea is to suppose it corresponds to a call to a procedure that is associated with the corresponding event channel. We will be interested in understanding what restrictions are imposed, if any, as to what may go on inside the called procedure. For instance, are calls to ipc$block to be permitted from within the associated procedure and/or from any of its dynamic descendants? We shall consider this possibility in the next subsection. For the moment, however, we shall assume such repeated calls do not happen.
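The wait-examine-dispatch cycle of a multi-purpose process can be sketched in Python as follows. This is an illustration only: the stand-in ipc_block simply takes the next pre-arranged arrival, the channel names and tasks are hypothetical, and a real process would loop indefinitely rather than stop when the arrivals run out.

```python
from collections import deque

def ipc_block(incoming, wait_list):
    """Stand-in for ipc$block: hand back the next message together with
    the 1-based wait-list index of the channel it arrived on."""
    name, text = incoming.popleft()
    return text, wait_list.index(name) + 1

def run_multi_purpose(wait_list, tasks, incoming):
    """Wait, examine the message, invoke the matching task, and repeat."""
    results = []
    while incoming:                        # a real process would loop forever
        text, index = ipc_block(incoming, wait_list)
        task = tasks[index - 1]            # which purpose does this event serve?
        results.append(task(text))         # run the task to completion
    return results

wait_list = ["print_request", "dump_request"]          # one channel per purpose
tasks = [lambda text: "printed " + text,
         lambda text: "dumped " + text]
incoming = deque([("dump_request", "x"), ("print_request", "y")])
results = run_multi_purpose(wait_list, tasks, incoming)
```

The wait-list index returned with each message is what lets the process pick the task without inspecting the channel name itself.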


7.5.3.1 Event call channels 
Note that further logical simplification (from the point of view of the user) arises, and a slight increase in efficiency can be gained, if the control logic of boxes 3, 4, 5 and 6 is made part of ipc$block. At the top-most logic level the process would be characterized simply as the execution of boxes 1 and 2, that is, initializing of channels, transmitting of setup information, etc., followed by a single call to ipc$block. There would be no return and of course no repeated calls on ipc$block. Of course, it would be necessary to furnish IPC with more information so it can perform its more elaborate job. Basically, this amounts to telling IPC what are the procedures that should be called (invoked) upon receipt of respective messages.


The "simplification" we have been discussing is in fact provided for in Multics by allowing the user to designate event channels of his choice for special interpretation. Event channels marked in this fashion are referred to as event call channels, as opposed to the ordinary event wait channels. Messages found in event call channels are examined and interpreted while the process is executing inside the IPC. Interpretation amounts to execution of a call to the associated procedure.
Figure 7-14 A possible structure for a multi-purpose process

7.5.3.2 Concept of the Wait Coordinator


As can now be seen, the code associated with entry point ipc$block is in fact more sophisticated than a simple scanner for messages received in event channels, since some action decisions (i.e., interpretation) are in fact delegated to this procedure. The code is referred to in MSPM documentation as the Wait Coordinator*, and aptly so.


Once an event channel has been created, a programmer is free to declare, by a call to an appropriate IPC entry point, that said channel is thereafter to be regarded by the Wait Coordinator as event call type. Subsequently, when the process is executing inside the Wait Coordinator, a scan of event channels that turns up a message in an event call channel will trigger a call to the associated procedure. It should be stressed that return from this call is to a return point within the Wait Coordinator. The net effect, therefore, is that in the case of event call channels, the action of boxes 5 and 6 of the Figure 7-14 flow chart is accomplished implicitly, i.e., on behalf of the procedure that calls the Wait Coordinator.


When a Multics user wishes to establish an event channel to be of the call type, he takes the following action:


(1) Create the event channel by a call to ipc$create_ev_chn. (This step sets up the channel, but its default interpretation is of the event wait type, i.e., while given this interpretation it may only be used as pictured in Figure 7-14.)

(2) Declare said channel to be of the event call type by a call to ipc$decl_ev_call_chn. The form of this call is

        call ipc$decl_ev_call_chn (channel_name, associated_procedure_ptr, data_ptr, priority, code);

    where channel_name is the name obtained from step (1). The second and third arguments in this call are saved in the event channel table (ECT) for later use by the Wait Coordinator so it can construct the desired call to the associated procedure.


* The principal MSPM reference is BJ. 10.03.

    Since a user is free to declare more than one event channel to be of the call type, it is necessary to provide the Wait Coordinator a scanning order for these channels. The user furnishes an integer argument, priority*, to be used by the Wait Coordinator as a scanning index. Thus, if channel billy and channel tilly are both declared to be event call channels and priorities 2 and 1 are associated with them, respectively, then if messages have been received on both channels, the procedure associated with channel tilly will be called before the one associated with channel billy.
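The two-step declaration and the resulting scanning order can be pictured with a small Python model of one ECT. This is a sketch only: the dictionary layout, the function names, and the return codes (0 for success, 1 for an unknown channel) are assumptions made for the illustration and are not taken from the IPC documentation.

```python
import itertools

_names = itertools.count(1)      # stand-in for clock-dependent unique names

def create_ev_chn(ect):
    """Create a channel; its default interpretation is event wait."""
    name = next(_names)
    ect[name] = {"type": "wait", "procedure": None, "data": None,
                 "priority": None, "messages": []}
    return name

def decl_ev_call_chn(ect, name, procedure, data, priority):
    """Mark an existing channel as event call, saving what the Wait
    Coordinator will need later. Returns a hypothetical status code."""
    entry = ect.get(name)
    if entry is None:
        return 1                 # no such channel
    entry.update(type="call", procedure=procedure, data=data,
                 priority=priority)
    return 0

def call_scan_order(ect):
    """Event call channels in the order they would be scanned:
    the lower the priority level, the earlier the channel is inspected."""
    calls = [(e["priority"], n) for n, e in ect.items() if e["type"] == "call"]
    return [n for _, n in sorted(calls)]

ect = {}
billy = create_ev_chn(ect)
tilly = create_ev_chn(ect)
plain = create_ev_chn(ect)                       # left as an event wait channel
code_billy = decl_ev_call_chn(ect, billy, print, None, 2)
code_tilly = decl_ev_call_chn(ect, tilly, print, None, 1)
code_bad = decl_ev_call_chn(ect, 999, print, None, 1)
order = call_scan_order(ect)
```

With priorities 2 and 1, channel tilly comes ahead of channel billy in the scan, as in the example of the text.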
The next three subsections (7.5.3.3, 7.5.3.4, and 7.5.3.5) round out the design details of the Wait Coordinator that may be of interest to some readers. They can easily be skipped on a first reading.

7.5.3.3 Call-Wait Polling Order


Although we have just suggested the rule for scanning event call channels, we have yet to explain the dependency relationship that exists between rules for scanning event call channels and those for scanning event wait channels. Each call to the Wait Coordinator (in reality a call to ipc$block) is in fact a request to scan, not one, but two lists of channels, the wait list and the call list. The wait list is the list of event wait channels that is pointed to (first argument) in the call to the Wait Coordinator. The call list is the list of event call channels that are currently kept in the ECT for the ring of the Wait Coordinator's caller.


We shall say that a W-C polling order is one in which the wait list is scanned first and then the call list, while a C-W polling order is the reverse (i.e., call list before wait list). The system's default polling order is W-C, but a user is given the opportunity, by calling a special entry point in the IPC, to reverse the current polling order.


It should be remembered that whenever a message is found in a channel of 
the wait list, channel scanning ends immediately. The discovered message (aug- 
mented by the wait list index) is copied into the caller's message area, and the 
Wait Coordinator returns to its caller. This means that when functioning in the 
default (W-C) polling order, the event call list is scanned only if no event wait 


message is found. 


* Strictly speaking, this argument is a priority level: the lower the integer (level), the higher the priority.
Whenever a call list is scanned, it is scanned in its entirety. Each event call channel is inspected for a message. If one is found, the associated procedure is called and, following a return from this call, if any, the next event call channel in the list is inspected, etc. The above scanning logic is summarized in the Figure 7-15 flow chart.
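The interplay of the two lists under each polling order can be modeled in Python. This is a sketch under stated assumptions: one call to coordinator_scan represents a single pass over both lists, the pending dictionary and the (priority level, channel, procedure) triples are invented data structures, and blocking when nothing is found is left out.

```python
def coordinator_scan(wait_list, call_list, pending, order="W-C"):
    """One pass of the scan; returns (wait_hit, results_of_called_procedures).

    wait_list: event wait channel names in listed order.
    call_list: (priority level, channel name, procedure) triples.
    pending:   dict mapping a channel name to its list of queued messages.
    """
    called = []

    def scan_wait():
        # Scanning ends at the first wait channel that holds a message.
        for index, name in enumerate(wait_list, start=1):
            if pending.get(name):
                return pending[name].pop(0), index
        return None

    def scan_call():
        # The call list is scanned in its entirety, lowest level first.
        for _, name, procedure in sorted(call_list, key=lambda c: c[0]):
            if pending.get(name):
                called.append(procedure(pending[name].pop(0)))

    if order == "W-C":
        hit = scan_wait()
        if hit is None:          # call list is reached only if no wait message
            scan_call()
    else:                        # "C-W": call list first, then the wait list
        scan_call()
        hit = scan_wait()
    return hit, called

def fresh():
    return {"w": ["wait-msg"], "billy": ["b"], "tilly": ["t"]}

calls = [(2, "billy", lambda m: "billy got " + m),
         (1, "tilly", lambda m: "tilly got " + m)]
wc = coordinator_scan(["w"], calls, fresh(), order="W-C")
cw = coordinator_scan(["w"], calls, fresh(), order="C-W")
idle = coordinator_scan(["w"], calls, {"billy": ["b"], "tilly": ["t"]})
```

Under W-C the waiting message shuts out both call channels for that pass, while under C-W both associated procedures run before the wait message is returned.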
7.5.3.4 Invoking an associated procedure and controlling its repeated use


A designer of a subsystem that uses an event call channel will, of course, be required to designate the name (or pointer) of the associated procedure at the time he declares the channel to be of event call type. The same designer may also be required to code this procedure. (We will call it AP, for convenience.) The Wait Coordinator, when it issues the call to AP, will always use a standard form for this call. AP's author must therefore code it so that it is compatible with this standard call. The rules are explained in the accompanying footnote.*


Several messages can be queued up over the same event call channel. But the Wait Coordinator must see to it that it treats only one message at a time (the top most). It should not recognize the next message in the queue until processing of the top most is completed. This means that an associated procedure must return control before the Wait Coordinator can permit itself to again inspect the same event call channel. The following paragraphs show why the controls are needed and how they are achieved.


It is easy to see how the Wait Coordinator could get into this situation. Suppose AP1 is called for the first message of a call channel "1" and suppose that during its execution AP1 must call ipc$block to await a message on some event wait channel. Further, suppose this message has not arrived at the time of this fresh call to ipc$block. The Wait Coordinator might then find itself scanning channel 1 once again, and if it finds another message (assuming no controls were set to prevent it), would call AP1 once again. We would then have a


* Conventions used in calling an associated procedure (AP) are:

   1. The AP has one argument, message_ptr, which is a pointer to an eight-word based structure, the first six words of which are as given in Figure 7-10.

   2. Before issuing the call, the Wait Coordinator appends to these six words a word pair whose value is that of data_ptr. This item, you recall, is the third argument furnished in the declaration of the event call channel. The data_ptr can, therefore, be thought of as an ordinary arglist ptr of a procedure, except it is available only indirectly in the message argument. In addition, of course, AP has access to the first six words of the message area, which is also useful information.


[Figure 7-15: flow chart of ipc$block (wait_list_ptr, message_ptr, code). The first box employs the currently governing polling order to concatenate the wait list and the call list (to form a single list to be scanned). A loop then advances through the channels, with hcs$block called when no more channels remain. For each channel holding a message, the channel type is tested. For an event wait channel, the message is copied with the channel's wait list index into the caller's message area and the return status is set in code. For an event call channel, the Associated Procedure is called (with the message ptr as the argument) in box 6.]

Figure 7-15 How ipc$block functions as a Wait Coordinator. (Note that this is an expanded version of the flow chart given in Figure 7-12.)
situation in which there are two invocations of AP1, each permitted to perform operations over the same set of external (and static internal) variables. Since the first activation could be suspended after making incomplete alterations to such variables, chaos could easily result.


To avoid this kind of confusion, the Wait Coordinator associates and maintains an inhibit flag for each event call channel. This flag is set immediately prior to the call to (and is reset immediately following the return from) the associated procedure. Moreover, event call channels that have inhibit flags set are ignored by the Wait Coordinator whenever the list of channels is scanned. This simple set of controls has been omitted from the picture given in Figure 7-15 to keep things simple, but could be added simply by replacing box 6 with the following amplification:


    6a  Inhibit flag set? If yes, ignore the channel. If no, continue.

    6b  Set the inhibit flag.

    6c  Call Associated Procedure (with the message ptr as the argument).

    6d  Reset the inhibit flag.
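The effect of the inhibit flag can be demonstrated with a short Python sketch. It is an illustration under assumptions, not the IPC code: the CallChannel class and the trace list are invented, a nested call to scan_call_channel stands for AP1 calling ipc$block, and the try/finally used to reset the flag is a choice of this sketch.

```python
class CallChannel:
    def __init__(self, procedure):
        self.procedure = procedure
        self.queue = []              # messages queued on this channel
        self.inhibited = False       # the inhibit flag

def scan_call_channel(channel, trace):
    """Boxes 6a-6d: ignore an inhibited channel, else call under the flag."""
    if channel.inhibited or not channel.queue:            # 6a
        return
    channel.inhibited = True                              # 6b
    try:
        channel.procedure(channel.queue.pop(0), trace)    # 6c
    finally:
        channel.inhibited = False                         # 6d

def ap1(message, trace):
    trace.append("start " + message)
    scan_call_channel(channel_1, trace)   # AP1 waits: the channel is rescanned
    trace.append("end " + message)

channel_1 = CallChannel(ap1)
channel_1.queue = ["m1", "m2"]
trace = []
scan_call_channel(channel_1, trace)   # handles m1; the nested scan ignores m2
scan_call_channel(channel_1, trace)   # flag has been reset, so m2 is handled
```

Without the test in 6a, the nested scan would have started a second activation of AP1 for m2 before the first had finished with m1.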


7.5.3.5 Other Channel Management Functions


The subsystem designer who has a further need to know about event channels and their management will be pleased to learn that the IPC offers a number of other services. Using these capabilities, a user may, for instance, control the polling order, delete as well as add new event channels, drain or flush out unwanted messages from existing channels, cause a given list of channels to be masked during certain periods, i.e., skipped over during normal scanning by the Wait Coordinator, convert event call channels back to the event wait category,


associate a given call channel with a new procedure and/or data pointer, and last, 


waiting), Details for making the appropriate IPC calls can be found in BJ. 10.01. 
7.5.4 Limitations of Multi-purpose Processes


The study of the Wait Coordinator has provided us with a new frame of reference for discussing multi-purpose processes. With additional study we can understand the potential as well as the limitations of such Multics processes.


Because each task of a Multics process must share the same stack with its sister tasks, the order in which events arrive and their time spacing clearly determine the order in which tasks are started and completed. A feeling for this event dependency can be gained by studying timing diagrams for specific cases. We present in Figures 7-16 and 7-17 two cases, each for a multi-purpose process having four event call channels whose associated procedures are EC1, EC2, EC3, and EC4. The respective priorities for these channels are assumed to be 1, 2, 3, and 4.


7.5.4.1 Two Case Studies Using Timing Diagrams*

Figure 7-16 covers a period of time which commences while associated procedure EC1 is executing. Events arrive for channels in order 2, 3, 4, 2, 1, spaced as shown in the vertical time line on the left side of the figure. The vertical line segments in columns marked EC1, EC2, etc., correspond to execution times for the respective tasks, which are carried out in the order 2, 3, 4, 1, and 2. This is somewhat different from the order in which the event messages were received, due to priority considerations.

Figure 7-17 is a similar case, but exhibits one important complication, namely: at some point during its execution each task must make a call to ipc$block to await a specific message (on an event wait channel). Arrival times for event wait messages are labeled ew1, ew2, etc.

We gain valuable insight by "walking" through this timing diagram to see why things happen the way they do.


* This section may be skipped on a first reading without loss of continuity.


[Figure 7-16: timing diagram. A vertical event time line on the left marks the arrivals of the event call messages; columns EC1 through EC4 show the execution periods of the respective tasks.]

Figure 7-16 Timing diagram showing execution of tasks EC1, EC2, EC3, and EC4 when triggered by events that arrive at points in time labeled ec1, ec2, etc., as indicated on the (vertical) event time line. Circled numbers show the sequence in which tasks are executed in virtual time.


[Figure 7-17: timing diagram. A vertical event time line marks the arrivals of event call (ec) and event wait (ew) messages; columns show Task EC1, Task EC2, Task EC3, Task EC4, and the Stack History. In the stack history, W's denote stack frames for the Wait Coordinator; the remaining entries are stack frames for the respective tasks.]

Figure 7-17 Timing diagram showing execution of tasks EC1, EC2, EC3, and EC4. This example assumes each task executes one call to ipc$block to await distinct events, labeled ew1, ew2, ew3, and ew4, respectively. Note that a W-C polling order is assumed.


(1) When task EC1 has called the Wait Coordinator to await arrival of event ew1, the Wait Coordinator discovers event ec2, so it triggers task EC2, which proceeds until it reaches a wait point.

(2) During execution of EC2, the event ew1 that task EC1 has been waiting for has arrived, but it cannot be recognized by the Wait Coordinator when it is called by EC2 to await event ew2. Consequently, the process is forced to block itself until the next recognizable event arrives, which, in our particular example, is ec3. Arrival of this event causes the Wait Coordinator to invoke task EC3.

    The reason why the Wait Coordinator fails to recognize ew1 when it is called by task EC2 is simple: the event ew1 is not on the wait list of this call! The Wait Coordinator is in fact executing with a new stack frame. Hence, this activation of the Wait Coordinator will not be "looking for" ew1. When will the Wait Coordinator again look for ew1? In a moment we will have the answer to this question, but first let us continue our walk through Figure 7-17.


(3) As a result of invoking EC3, this task executes until it reaches 
a wait point and calls the Wait Coordinator with the wait list, 
ew3. Since this event has not yet arrived, and since no event 
call message is initially present, the process is again forced 
to block itself. 


(4) The process is revived following arrival of ec4, at which time
task EC4 is invoked.


(5) After the process has gone blocked again for a short period,
ew4 arrives. The Wait Coordinator recognizes ew4 because it is on
its wait list, so task EC4 resumes, executes to completion, and
returns to the Wait Coordinator.


(6) A return (as opposed to a call) to the Wait Coordinator implies
reversion to a preceding stack frame of the Wait Coordinator
(i.e., to a prior activation). Execution in the prior activation
means that the Wait Coordinator can now recognize the arrival
of ew3. It may be noted that two event call events have also
arrived. However, we are assuming a W-C polling order in the
example. As a result, ew3 will be the first message discovered in
the scan, and task EC3 is resumed and completed.


(7) & (8) The above reasoning may be repeated to see how tasks EC2 and
EC1 may be completed, in this order.


(9) Upon completion of task EC1 and the return to the Wait
Coordinator, the pending event call events are discoverable, since
there are no event wait messages on hand. The message for ec1 is
discovered first because it has higher priority, causing EC1 to be
invoked. (Thus endeth this 9-step walk.)
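
The walk above can be replayed with a small simulation. The sketch
below is a toy model in Python, not the Multics implementation: the
names Process, block, elapse, make_task and PRIORITY are hypothetical,
and the event arrival order is an assumption chosen to be consistent
with the nine steps (the figure itself is not reproduced here). Each
call to block stands for one activation of the Wait Coordinator, so
the Python call stack plays the part of the stack history.

```python
class EndOfTimeline(Exception):
    """The process would go blocked, but the scripted timeline is empty."""


# Assumed priority of event call channels: lower number = higher priority.
PRIORITY = {"ec1": 1, "ec2": 2, "ec3": 3, "ec4": 4}


class Process:
    def __init__(self, timeline):
        self.timeline = list(timeline)  # events that have not yet arrived
        self.pending = []               # events that arrived, not yet consumed
        self.tasks = {}                 # event call name -> task procedure
        self.trace = []

    def elapse(self, n):
        """Let up to n scripted events arrive while a task is executing."""
        for _ in range(n):
            if self.timeline:
                self.pending.append(self.timeline.pop(0))

    def block(self, wait_list):
        """One activation of the modelled Wait Coordinator, W-C order."""
        while True:
            # W: event wait messages, but only those on THIS call's wait list.
            for event in wait_list:
                if event in self.pending:
                    self.pending.remove(event)
                    return
            # C: event call messages, highest priority first.
            calls = sorted((e for e in self.pending if e in self.tasks),
                           key=PRIORITY.get)
            if calls:
                self.pending.remove(calls[0])
                self.tasks[calls[0]](self)  # nested call: new stack frames
                continue                    # task returned: rescan here
            # Nothing recognizable: go blocked until the next event arrives.
            if not self.timeline:
                raise EndOfTimeline
            self.pending.append(self.timeline.pop(0))


def make_task(name, wait_event, before, after):
    """Task that runs, waits once for wait_event, then runs to completion."""
    def task(proc):
        proc.trace.append(name + " starts")
        proc.elapse(before)          # events arriving before the wait point
        proc.block([wait_event])     # the single call to the Wait Coordinator
        proc.trace.append(name + " resumes")
        proc.elapse(after)           # events arriving before completion
        proc.trace.append(name + " completes")
    return task


def simulate():
    # Assumed arrival order; ew3 and ew2 arrive while only ew4 is awaited.
    proc = Process(["ec1", "ec2", "ew1", "ec3", "ec4",
                    "ew3", "ew2", "ew4", "ec2", "ec1"])
    proc.tasks = {
        "ec1": make_task("EC1", "ew1", before=1, after=0),
        "ec2": make_task("EC2", "ew2", before=1, after=0),
        "ec3": make_task("EC3", "ew3", before=0, after=0),
        "ec4": make_task("EC4", "ew4", before=0, after=2),
    }
    try:
        proc.block([])               # outermost activation: empty wait list
    except EndOfTimeline:
        pass                         # script exhausted; stop the run
    return proc.trace


if __name__ == "__main__":
    print("\n".join(simulate()))
```

In this model the tasks start in the order EC1, EC2, EC3, EC4 and
complete in the reverse order, even though ew1 is the first awaited
event to arrive. The run is cut off when the scripted timeline is
exhausted, shortly after EC1 is invoked for the second time.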


7.5.4.2 Ways to Prevent Sluggish Event Wait Response of Event Call Tasks

A serious shortcoming of the multi-purpose process should now be evident.
A difficulty may arise any time an event call task is forced to call ipc$block for
an expected event. Even though that event may arrive with reasonable dispatch,
there is no guarantee that the Wait Coordinator will "respond" in a reasonable
length of time by giving this task an opportunity to resume. For instance,
suppose

(a) Task EC1 calls ipc$block to await event ew1.

(b) The Wait Coordinator then invokes task EC2, and shortly thereafter
ew1 arrives. Then,

(c) Task EC2 calls ipc$block for an event ew2, which takes an
unexpectedly long time to arrive.

Nothing can be done to give control back to EC1 even though its awaited message
has long since arrived. (Changing the polling order does not help.) The
difficulty stems from the fact that a stack history has been built up in which
the frames for EC2 and for the current activation of the Wait Coordinator, W3,
lie above the frame for EC1. It is impossible to resume EC1 without doing an
abnormal return, i.e., from W3 to EC1. But this action would have the effect of
aborting task EC2, which could cause chaos.
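
The cost of such an abnormal return can be made concrete with a few
lines of Python. This is only an illustrative sketch: the frame
labels W1 through W3 are an assumed reconstruction of the stack
history for scenario (a) through (c), and frames_discarded is a
hypothetical helper, not a system routine.

```python
def frames_discarded(stack, target):
    """Frames popped by a non-local return from the top of stack to target."""
    depth = stack.index(target)
    return stack[depth + 1:]


# Oldest frame first. Assumed history: W1 invoked EC1, EC1 called W2,
# W2 invoked EC2, and EC2 called W3, which is the current activation.
stack = ["W1", "EC1", "W2", "EC2", "W3"]

lost = frames_discarded(stack, "EC1")
aborted_tasks = [frame for frame in lost if frame.startswith("EC")]

if __name__ == "__main__":
    print("frames discarded:", lost)
    print("tasks aborted:", aborted_tasks)
```

Every frame above EC1 is thrown away by the return, and one of those
frames belongs to a task that has not finished.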


Three approaches are open to the programmer to circumvent this problem.
Two of these, (1) and (2), are less than fully satisfactory.


(1) Program all associated procedures so that they and all their
dynamic descendants (if any) execute no calls to ipc$block. (This
will be difficult because if an associated procedure or any of
its descendants calls a system library routine or one written
by another individual, there is no easy way to be sure, without
reading the code, whether said targets do or do not call ipc$block.)


(2) Program each associated procedure so that

a. the first thing it does upon being called is to execute an
ipc call that masks all event call channels, i.e.,

call ipc$mask_ev_calls (code);

b. the last thing it does, prior to returning, is to execute an
ipc call that unmasks all event call channels, i.e.,

call ipc$unmask_ev_calls (code);

This approach has the merit that any task that is invoked is
treated as having absolute top priority. In a sense, it can
be regarded as an extreme approach to solving the problem.
Figure 7-18 illustrates what we mean. Here we show the
effects of masking and unmasking event call channels using
the same event timing sequence as in Figure 7-17. Note
that task EC4, because of its low priority, is not even begun
during the time span being considered.

Figure 7-18  Same case as in Figure 7-17 except that calls to
ipc$mask_ev_calls and to ipc$unmask_ev_calls are
made at points marked M and U, respectively.
[Diagram not reproduced.]

(3) This approach will no doubt prove to be the most attractive: Give
up trying the multi-purpose approach in the first place! Go back
to the principal-design approach of Multics and let each task that
is now programmed as an event call task and in need of better
response from its Wait Coordinator be made into an independent
process to accomplish the same objective. Each such created
process would have a single event call channel over which it can
be signalled. Hence, competition for good response by its Wait
Coordinator will now be eliminated.


Bear in mind, however, that after establishing separate processes
for the several tasks, these can now, in principle anyway, be
executed in parallel, whenever two or more processors can be
awarded to these tasks during a single period of time. The mere
fact that execution can proceed in parallel as a result of following
this approach carries with it the need for greater care in the
handling of shared data segments.
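
The effect of approach (2) on scenario (a) through (c) can be
sketched with a toy model. The Python below is an illustration under
stated assumptions, not the Multics implementation: Process, block,
make_task and the masked flag are hypothetical names, the flag merely
stands in for the calls to ipc$mask_ev_calls and ipc$unmask_ev_calls,
and the arrival order ec1, ec2, ew1, ew2 is assumed.

```python
class EndOfTimeline(Exception):
    """The process would go blocked, but the scripted timeline is empty."""


class Process:
    def __init__(self, timeline):
        self.timeline = list(timeline)  # events that have not yet arrived
        self.pending = []               # events that arrived, not yet consumed
        self.tasks = {}                 # event call name -> task procedure
        self.masked = False             # True while event calls are masked
        self.trace = []

    def block(self, wait_list):
        """One activation of the modelled Wait Coordinator, W-C order."""
        while True:
            for event in wait_list:     # W: this call's wait list only
                if event in self.pending:
                    self.pending.remove(event)
                    return
            if not self.masked:         # C: skipped entirely while masked
                # Assumed priority: ec1 outranks ec2 (plain name order).
                calls = sorted(e for e in self.pending if e in self.tasks)
                if calls:
                    self.pending.remove(calls[0])
                    self.tasks[calls[0]](self)
                    continue
            if not self.timeline:
                raise EndOfTimeline
            self.pending.append(self.timeline.pop(0))  # blocked until arrival


def make_task(name, wait_event, masking):
    def task(proc):
        if masking:
            proc.masked = True      # stands in for ipc$mask_ev_calls
        proc.trace.append(name + " starts")
        proc.block([wait_event])
        proc.trace.append(name + " completes")
        if masking:
            proc.masked = False     # stands in for ipc$unmask_ev_calls
    return task


def simulate(masking):
    proc = Process(["ec1", "ec2", "ew1", "ew2"])
    proc.tasks = {"ec1": make_task("EC1", "ew1", masking),
                  "ec2": make_task("EC2", "ew2", masking)}
    try:
        proc.block([])              # outermost activation: empty wait list
    except EndOfTimeline:
        pass                        # script exhausted; stop the run
    return proc.trace


if __name__ == "__main__":
    print("unmasked:", simulate(masking=False))
    print("masked:  ", simulate(masking=True))
```

Without masking, EC1 cannot complete until EC2 has completed. With
masking, EC1 completes as soon as ew1 arrives, and EC2 is not even
started until EC1 has returned.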


